Thread: Re: BUG #4204: COPY to table with FK has memory leak
[moving to -hackers]

"Tom Lane" <tgl@sss.pgh.pa.us> writes:
> "Tomasz Rybak" <bogomips@post.pl> writes:
>> I tried to use COPY to import 27M rows to table:
>> CREATE TABLE sputnik.ccc24 (
>>     station CHARACTER(4) NOT NULL REFERENCES sputnik.station24 (id),
>>     moment INTEGER NOT NULL,
>>     flags INTEGER NOT NULL
>> ) INHERITS (sputnik.sputnik);
>> COPY sputnik.ccc24(id, moment, station, strength, sequence, flags)
>> FROM '/tmp/24c3' WITH DELIMITER AS ' ';
>
> This is expected to take lots of memory because each row-requiring-check
> generates an entry in the pending trigger event list.  Even if you had
> not exhausted memory, the actual execution of the retail checks would
> have taken an unreasonable amount of time.  The recommended way to do
> this sort of thing is to add the REFERENCES constraint *after* you load
> all the data; that'll be a lot faster in most cases because the checks
> are done "in bulk" using a JOIN rather than one-at-a-time.

Hm, it occurs to me that we could still do a join against the pending
event trigger list... I wonder how feasible it would be to store the
pending trigger event list in a temporary table instead of in ram.

-- 
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning
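For concreteness, the load-then-constrain approach Tom recommends would look
something like this (a sketch based on the table definition above; the
constraint name is illustrative, and the remaining columns are assumed to be
inherited from sputnik.sputnik):

    -- Create the table without the REFERENCES clause...
    CREATE TABLE sputnik.ccc24 (
        station CHARACTER(4) NOT NULL,   -- no FK here yet
        moment INTEGER NOT NULL,
        flags INTEGER NOT NULL
    ) INHERITS (sputnik.sputnik);

    -- ...bulk-load the data, queueing no per-row trigger events...
    COPY sputnik.ccc24(id, moment, station, strength, sequence, flags)
    FROM '/tmp/24c3' WITH DELIMITER AS ' ';

    -- ...then add the constraint, which validates all rows in one pass.
    ALTER TABLE sputnik.ccc24
        ADD CONSTRAINT ccc24_station_fkey
        FOREIGN KEY (station) REFERENCES sputnik.station24 (id);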
Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> This is expected to take lots of memory because each row-requiring-check
>> generates an entry in the pending trigger event list.

> Hm, it occurs to me that we could still do a join against the pending event
> trigger list... I wonder how feasible it would be to store the pending
> trigger event list in a temporary table instead of in ram.

We could make that list spill to disk, but the problem remains that
verifying the rows one at a time will take forever.

The idea that's been kicked around occasionally is that once you get
past N pending events, throw them all away and instead queue a single
operation to do a bulk verify (just like initial establishment of the
FK constraint).  I'm not sure how to do the queue management for this
though.

			regards, tom lane
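The bulk verify Tom mentions is essentially the anti-join used when an FK
constraint is first established. In terms of the tables from the original
report, it would be roughly:

    -- every row returned is a violation; an empty result means the
    -- constraint holds for the whole table
    SELECT fk.station
    FROM ONLY sputnik.ccc24 fk
        LEFT OUTER JOIN ONLY sputnik.station24 pk ON pk.id = fk.station
    WHERE pk.id IS NULL AND fk.station IS NOT NULL;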
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Gregory Stark <stark@enterprisedb.com> writes: >> "Tom Lane" <tgl@sss.pgh.pa.us> writes: >>> This is expected to take lots of memory because each row-requiring-check >>> generates an entry in the pending trigger event list. > >> Hm, it occurs to me that we could still do a join against the pending event >> trigger list... I wonder how feasible it would be to store the pending trigger >> event list in a temporary table instead of in ram. > > We could make that list spill to disk, but the problem remains that > verifying the rows one at a time will take forever. Well I was thinking if we did a join between a temporary table and the fk target then it wouldn't have to be a one-by-one operation. It could be a merge join if the planner thought that was better. How to get accurate stats into the planner at that point would be a missing detail though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
On Wed, 2008-05-28 at 16:28 -0400, Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> This is expected to take lots of memory because each row-requiring-check
>>> generates an entry in the pending trigger event list.
>
>> Hm, it occurs to me that we could still do a join against the pending event
>> trigger list... I wonder how feasible it would be to store the pending
>> trigger event list in a temporary table instead of in ram.
>
> We could make that list spill to disk, but the problem remains that
> verifying the rows one at a time will take forever.
>
> The idea that's been kicked around occasionally is that once you get
> past N pending events, throw them all away and instead queue a single
> operation to do a bulk verify (just like initial establishment of the
> FK constraint).  I'm not sure how to do the queue management for this
> though.

Neither of those approaches is really suitable. Just spilling to disk is
O(N) in the number of rows loaded, and the bulk verify is O(N) at least
in the number of rows loaded plus existing. The bulk verify doesn't help
either: if the table had been empty you'd have added the FK afterwards,
so we must assume there are already rows in there, and in most cases the
rows already loaded will exceed those being added by the bulk operation.

AFAICS we must aggregate the trigger checks. We would need a special
property of triggers that allowed them to be aggregated when two similar
checks arrived. We can then use hash aggregation to accumulate them. We
might conceivably need to spill to disk also, since the aggregation may
not always be effective. But in most cases the tables against which FK
checks are made are significantly smaller than the tables being loaded.
Once we have hash aggregated them, that is then the first part of a hash
join to the target table.

We certainly need a TODO item for "improve RI checks during bulk
operations".

-- 
 Simon Riggs
 www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
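In SQL terms, the aggregation Simon describes amounts to collapsing the
queued events to one check per distinct key before probing the referenced
table (again using the hypothetical pending_fk_events spill table):

    -- 27M queued events may collapse to only a handful of distinct keys
    SELECT k.station
    FROM (SELECT DISTINCT station
          FROM pending_fk_events) k        -- the hash-aggregation step
        LEFT OUTER JOIN sputnik.station24 pk ON pk.id = k.station
    WHERE pk.id IS NULL;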
"Simon Riggs" <simon@2ndquadrant.com> writes: > AFAICS we must aggregate the trigger checks. We would need a special > property of triggers that allowed them to be aggregated when two similar > checks arrived. We can then use hash aggregation to accumulate them. We > might conceivably need to spill to disk also, since the aggregation may > not always be effective. But in most cases the tables against which FK > checks are made are significantly smaller than the tables being loaded. > Once we have hash aggregated them, that is then the first part of a hash > join to the target table. Well we can't aggregate them as they're created because later modifications could delete or update the original records. The SQL spec requires that FK checks be effective at the end of the command. I admit off the top of my head I can't actually come up with any situations which would be covered by the spec. All the instances I can think of involve either Postgres's UPDATE FROM or plpgsql functions or some other postgres specific functionality. But I do seem to recall there were some situations where it mattered. But we could aggregate them when it comes time to actually check them. Or we could hash the FK keys and scan the event list. Or we could sort the two and merge join them.... > We certainly need a TODO item for "improve RI checks during bulk > operations". I have a feeling it's already there. Hm. There's a whole section on RI triggers but the closest I see is this, neither of the links appear to refer to bulk operations: Optimize referential integrity checks http://archives.postgresql.org/pgsql-performance/2005-10/msg00458.phphttp://archives.postgresql.org/pgsql-hackers/2007-04/msg00744.php -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
Gregory Stark <stark@enterprisedb.com> writes:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>> We certainly need a TODO item for "improve RI checks during bulk
>> operations".

> I have a feeling it's already there. Hm. There's a whole section on RI
> triggers, but the closest I see is this, and neither of the links appears
> to refer to bulk operations:
> Optimize referential integrity checks
> http://archives.postgresql.org/pgsql-performance/2005-10/msg00458.php
> http://archives.postgresql.org/pgsql-hackers/2007-04/msg00744.php

No, both of those are talking about the same thing, ie, (1) making the
are-the-keys-unchanged optimization work when NULLs are present, and
(2) not testing for this case twice.  There's an entry in the Triggers
section

    * Add deferred trigger queue file

      Right now all deferred trigger information is stored in backend
      memory.  This could exhaust memory for very large trigger queues.
      This item involves dumping large queues into files.

but as already noted, this is a pretty myopic answer (at least for RI
triggers).

			regards, tom lane
On Wed, 2008-05-28 at 18:17 -0400, Gregory Stark wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>
>> AFAICS we must aggregate the trigger checks. We would need a special
>> property of triggers that allowed them to be aggregated when two similar
>> checks arrived. We can then use hash aggregation to accumulate them. We
>> might conceivably need to spill to disk also, since the aggregation may
>> not always be effective. But in most cases the tables against which FK
>> checks are made are significantly smaller than the tables being loaded.
>> Once we have hash aggregated them, that is then the first part of a hash
>> join to the target table.
>
> Well, we can't aggregate them as they're created, because later
> modifications could delete or update the original records. The SQL spec
> requires that FK checks be effective at the end of the command.

Well, that's what we need to do. We just need to find a way...

Currently we store trigger entries by htid. I guess we need to aggregate
them on the actual values looked up. The SQL spec also says that the
contents of the FK check table should be taken as at the start of the
command, so we should be safe to aggregate the values prior to the check.

As already suggested in work on Read Only Tables, we could optimise them
away to being constraint checks.

-- 
 Simon Riggs
 www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
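Keying the queue on the values looked up rather than on tuple ids, as
Simon suggests, would make the spilled structure look more like a keyed
counter than a list of per-row events. A hypothetical sketch of what such
a value-keyed queue could hold:

    -- one entry per distinct key, no matter how many rows referenced it;
    -- a duplicate arrival increments n instead of adding a new entry
    CREATE TEMP TABLE pending_fk_checks (
        station CHARACTER(4) PRIMARY KEY,
        n BIGINT NOT NULL      -- number of queued events collapsed here
    );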
On Wed, 2008-05-28 at 22:45 +0100, Simon Riggs wrote:
> On Wed, 2008-05-28 at 16:28 -0400, Tom Lane wrote:
>> Gregory Stark <stark@enterprisedb.com> writes:
>>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>>> This is expected to take lots of memory because each row-requiring-check
>>>> generates an entry in the pending trigger event list.
>>
>>> Hm, it occurs to me that we could still do a join against the pending
>>> event trigger list... I wonder how feasible it would be to store the
>>> pending trigger event list in a temporary table instead of in ram.
>>
>> We could make that list spill to disk, but the problem remains that
>> verifying the rows one at a time will take forever.
>>
>> The idea that's been kicked around occasionally is that once you get
>> past N pending events, throw them all away and instead queue a single
>> operation to do a bulk verify (just like initial establishment of the
>> FK constraint).  I'm not sure how to do the queue management for this
>> though.
>
> Neither of those approaches is really suitable. Just spilling to disk is
> O(N) in the number of rows loaded, and the bulk verify is O(N) at least
> in the number of rows loaded plus existing. The bulk verify doesn't help
> either: if the table had been empty you'd have added the FK afterwards,
> so we must assume there are already rows in there, and in most cases the
> rows already loaded will exceed those being added by the bulk operation.
>
> AFAICS we must aggregate the trigger checks. We would need a special
> property of triggers that allowed them to be aggregated when two similar
> checks arrived. We can then use hash aggregation to accumulate them. We
> might conceivably need to spill to disk also, since the aggregation may
> not always be effective.

Can't we just do the checks for the FKs accumulated at the point they
don't fit in memory, instead of spilling to disk?

> But in most cases the tables against which FK
> checks are made are significantly smaller than the tables being loaded.
> Once we have hash aggregated them, that is then the first part of a hash
> join to the target table.
>
> We certainly need a TODO item for "improve RI checks during bulk
> operations".

Agreed.

----------------
Hannu
On May 28, 2008, at 1:22 PM, Gregory Stark wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> "Tomasz Rybak" <bogomips@post.pl> writes:
>>> I tried to use COPY to import 27M rows to table:
>>> CREATE TABLE sputnik.ccc24 (
>>>     station CHARACTER(4) NOT NULL REFERENCES sputnik.station24 (id),
>>>     moment INTEGER NOT NULL,
>>>     flags INTEGER NOT NULL
>>> ) INHERITS (sputnik.sputnik);
>>> COPY sputnik.ccc24(id, moment, station, strength, sequence, flags)
>>> FROM '/tmp/24c3' WITH DELIMITER AS ' ';
>>
>> This is expected to take lots of memory because each row-requiring-check
>> generates an entry in the pending trigger event list.  Even if you had
>> not exhausted memory, the actual execution of the retail checks would
>> have taken an unreasonable amount of time.  The recommended way to do
>> this sort of thing is to add the REFERENCES constraint *after* you load
>> all the data; that'll be a lot faster in most cases because the checks
>> are done "in bulk" using a JOIN rather than one-at-a-time.
>
> Hm, it occurs to me that we could still do a join against the pending
> event trigger list... I wonder how feasible it would be to store the
> pending trigger event list in a temporary table instead of in ram.

Related to that, I really wish that our statement-level triggers provided
NEW and OLD recordsets like some other databases do. That would allow RI
triggers to be done on a per-statement basis, and they could aggregate
keys to be checked.

-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
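What Jim is wishing for corresponds to the SQL standard's transition
tables, which PostgreSQL did not have at the time of this thread (they
eventually arrived in version 10). A sketch of a set-based RI check with
that feature (the trigger and function names are illustrative):

    CREATE FUNCTION check_station_fk() RETURNS trigger AS $$
    BEGIN
        -- one join per statement, not one lookup per row
        IF EXISTS (SELECT 1
                   FROM new_rows n
                       LEFT JOIN sputnik.station24 pk ON pk.id = n.station
                   WHERE pk.id IS NULL) THEN
            RAISE EXCEPTION 'rows in ccc24 violate FK on station';
        END IF;
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER ccc24_station_ri
        AFTER INSERT ON sputnik.ccc24
        REFERENCING NEW TABLE AS new_rows
        FOR EACH STATEMENT
        EXECUTE PROCEDURE check_station_fk();

Duplicate keys in new_rows are collapsed by the join, which is exactly
the aggregation of keys-to-be-checked that Jim describes.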