Thread: Why would this use 600Meg of VM?

Why would this use 600Meg of VM?

From
Larry Rosenman
Date:
Can one of you knowledgeable people tell me why current CVS as of 
a week ago would have the backend running this query grow to 
600 meg+?


INSERT into traffic_summary
SELECT asn,protocol,
cast(sum(pkts_src) as float) as pkts_src,
cast(sum(pkts_dst) as float) as pkts_dst,
cast(sum(bytes_src) as float) as bytes_src,
cast(sum(bytes_dst) as float) as bytes_dst,
cast(sum(secs_src) as float)  as secs_src,
cast(sum(secs_dst) as float) as secs_dst,
min(early) as early,
max(late) as late 
FROM traffic 
WHERE early between '2001-06-01 00:00:00'::timestamp and '2001-06-18 23:59:59'::timestamp 
GROUP BY asn,protocol,date_part('epoch',early)/60/60;
-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


Re: Why would this use 600Meg of VM?

From
Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> Can one of you knowledgeable people tell me why current CVS as of 
> a week ago would have the backend running this query grow to 
> 600 meg+?

Sounds like there's still a memory leak in there somewhere, but the
query looks fairly harmless.  Could we see enough info to reproduce
this?  (Table declarations, explain output, etc)  Another useful
attack would be to let the query run awhile, then set a breakpoint
at sbrk().  Stack traces from the first few hits of the breakpoint
would give a pretty good indication of where the leak is, probably.
        regards, tom lane


Re: Why would this use 600Meg of VM?

From
Larry Rosenman
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [010622 11:55]:
> Larry Rosenman <ler@lerctr.org> writes:
> > Can one of you knowledgeable people tell me why current CVS as of 
> > a week ago would have the backend running this query grow to 
> > 600 meg+?
> 
> Sounds like there's still a memory leak in there somewhere, but the
> query looks fairly harmless.  Could we see enough info to reproduce
> this?  (Table declarations, explain output, etc)  Another useful
> attack would be to let the query run awhile, then set a breakpoint
> at sbrk().  Stack traces from the first few hits of the breakpoint
> would give a pretty good indication of where the leak is, probably.
> 
>             regards, tom lane

neteng@tide.iadfw.net$ psql  traffic_analysis
Welcome to psql, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit
 

traffic_analysis=# analyze traffic;
ANALYZE
traffic_analysis=# \i traffic_sum.sql 
psql:traffic_sum.sql:15: NOTICE:  QUERY PLAN:

Subquery Scan *SELECT*  (cost=8471740.01..8994414.10 rows=1900633 width=72)
  ->  Aggregate  (cost=8471740.01..8994414.10 rows=1900633 width=72)
        ->  Group  (cost=8471740.01..8614287.49 rows=19006331 width=72)
              ->  Sort  (cost=8471740.01..8471740.01 rows=19006331 width=72)
                    ->  Seq Scan on traffic  (cost=0.00..615601.86 rows=19006331 width=72)

EXPLAIN
traffic_analysis=# 


neteng@tide.iadfw.net$ cat traffic_sum.sql
EXPLAIN
INSERT into traffic_summary
SELECT asn,protocol,
cast(sum(pkts_src) as float) as pkts_src,
cast(sum(pkts_dst) as float) as pkts_dst,
cast(sum(bytes_src) as float) as bytes_src,
cast(sum(bytes_dst) as float) as bytes_dst,
cast(sum(secs_src) as float)  as secs_src,
cast(sum(secs_dst) as float) as secs_dst,
min(early) as early,
max(late) as late 
FROM traffic 
WHERE early between '2001-06-01 00:00:00'::timestamp and '2001-06-18 23:59:59'::timestamp 
GROUP BY asn,protocol,date_part('epoch',early)/60/60;
neteng@tide.iadfw.net$ 

What else? 

Failing a way to actually get this query to run, how would you suggest
aggregating the data down to 1 hour summaries?
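One way to express the 1-hour rollup more directly is with PostgreSQL's date_trunc(), which buckets each timestamp to the top of its hour and avoids the epoch arithmetic in the GROUP BY. A sketch against the schema above (untested, same column list as the original query):

```sql
-- Hourly rollup using date_trunc('hour', ...) as the grouping key.
-- min(early)/max(late) still give the actual observed bounds per bucket.
INSERT INTO traffic_summary
SELECT asn, protocol,
       cast(sum(pkts_src)  AS float) AS pkts_src,
       cast(sum(pkts_dst)  AS float) AS pkts_dst,
       cast(sum(bytes_src) AS float) AS bytes_src,
       cast(sum(bytes_dst) AS float) AS bytes_dst,
       cast(sum(secs_src)  AS float) AS secs_src,
       cast(sum(secs_dst)  AS float) AS secs_dst,
       min(early) AS early,
       max(late)  AS late
FROM traffic
WHERE early BETWEEN '2001-06-01 00:00:00'::timestamp
                AND '2001-06-18 23:59:59'::timestamp
GROUP BY asn, protocol, date_trunc('hour', early);
```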

neteng@tide.iadfw.net$ psql  traffic_analysis
Welcome to psql, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit
 

traffic_analysis=# \d traffic
             Table "traffic"
 Attribute |           Type           | Modifier 
-----------+--------------------------+----------
 asn       | integer                  | 
 protocol  | integer                  | 
 pkts_src  | bigint                   | 
 pkts_dst  | bigint                   | 
 bytes_src | bigint                   | 
 bytes_dst | bigint                   | 
 secs_src  | bigint                   | 
 secs_dst  | bigint                   | 
 early     | timestamp with time zone | 
 late      | timestamp with time zone | 
Index: traffic_early
traffic_analysis=# \d traffic_summary
         Table "traffic_summary"
 Attribute |           Type           | Modifier
-----------+--------------------------+----------
 asn       | integer                  | 
 protocol  | integer                  | 
 pkts_src  | double precision         | 
 pkts_dst  | double precision         | 
 bytes_src | double precision         | 
 bytes_dst | double precision         | 
 secs_src  | double precision         | 
 secs_dst  | double precision         | 
 early     | timestamp with time zone | 
 late      | timestamp with time zone | 
traffic_analysis=# 
traffic_analysis=# \d traffic_early
     Index "traffic_early"
 Attribute |           Type           
-----------+--------------------------
 early     | timestamp with time zone
btree

traffic_analysis=# 

LER

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


Re: Why would this use 600Meg of VM?

From
Larry Rosenman
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [010622 12:31]:
> Larry Rosenman <ler@lerctr.org> writes:
> > What else? 
> 
> If you don't want to do the debugger work yourself, could you send me
> enough of the data to let me reproduce the problem?
How much data do you need?  It's several hundred megs.  I can probably
get permission to give you a login on tide if that would be easier?

LER

> 
>             regards, tom lane

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


Re: Why would this use 600Meg of VM?

From
Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> What else? 

If you don't want to do the debugger work yourself, could you send me
enough of the data to let me reproduce the problem?
        regards, tom lane


Re: Why would this use 600Meg of VM?

From
Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> Can one of you knowledgeable people tell me why current CVS as of 
> a week ago would have the backend running this query grow to 
> 600 meg+?

The answer: the query has nothing to do with it.  However, the
deferred triggers you have on the target relation have a lot to do
with it.  It's all deferred-trigger-event storage.
        regards, tom lane


Re: Why would this use 600Meg of VM?

From
Philip Warner
Date:
At 01:06 24/06/01 -0400, Tom Lane wrote:
>
>The answer: the query has nothing to do with it.  However, the
>deferred triggers you have on the target relation have a lot to do
>with it.  It's all deferred-trigger-event storage.

Would it be worth using a local (system) temporary table for this sort of
thing?


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: Why would this use 600Meg of VM?

From
Larry Rosenman
Date:
* Philip Warner <pjw@rhyme.com.au> [010624 00:46]:
> At 01:06 24/06/01 -0400, Tom Lane wrote:
> >
> >The answer: the query has nothing to do with it.  However, the
> >deferred triggers you have on the target relation have a lot to do
> >with it.  It's all deferred-trigger-event storage.
> 
> Would it be worth using a local (system) temporary table for this sort of
> thing?
If this is an FK check, I can probably turn it off.  

I wonder if there is a better way? 
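A sketch of the workaround being discussed, in later PostgreSQL syntax (the constraint and column names are hypothetical; at the time of this thread, before ALTER TABLE ... DROP CONSTRAINT existed, the equivalent was dropping the RI triggers by hand):

```sql
-- Drop the FK before the bulk rollup so no per-row trigger events
-- are queued, then re-create it afterwards; ADD CONSTRAINT performs
-- one full check of the table instead of millions of queued checks.
ALTER TABLE traffic_summary DROP CONSTRAINT traffic_summary_asn_fkey;

-- ... run the big INSERT ... SELECT rollup here ...

ALTER TABLE traffic_summary
  ADD CONSTRAINT traffic_summary_asn_fkey
  FOREIGN KEY (asn) REFERENCES asns (asn);  -- referenced table is assumed
```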
> 
> 
> ----------------------------------------------------------------
> Philip Warner                    |     __---_____
> Albatross Consulting Pty. Ltd.   |----/       -  \
> (A.B.N. 75 008 659 498)          |          /(@)   ______---_
> Tel: (+61) 0500 83 82 81         |                 _________  \
> Fax: (+61) 0500 83 82 82         |                 ___________ |
> Http://www.rhyme.com.au          |                /           \|
>                                  |    --________--
> PGP key available upon request,  |  /
> and from pgp5.ai.mit.edu:11371   |/

-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


Re: Why would this use 600Meg of VM?

From
Bruce Momjian
Date:
> At 01:06 24/06/01 -0400, Tom Lane wrote:
> >
> >The answer: the query has nothing to do with it.  However, the
> >deferred triggers you have on the target relation have a lot to do
> >with it.  It's all deferred-trigger-event storage.
> 
> Would it be worth using a local (system) temporary table for this sort of
> thing?

Jan initially wanted to store large FK trigger events in a file when they
got too big, but never completed it.

The TODO list has:
* Add deferred trigger queue file (Jan)

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026