Thread: Join over two tables of 50K records takes 2 hours

From: Svetlin Manavski
Hi all,
I am running 9.0.3 with the settings listed below. I have a prohibitively slow query in an application that otherwise performs well:

select
    cast (SD.detectorid as numeric),
    CAST( ( (SD.createdtime - 0 )/ 18000000000000::bigint ) AS numeric) as timegroup,
    sum(datafromsource)+sum(datafromdestination) as numbytes,
    CAST ( sum(packetsfromsource)+sum(packetsfromdestination) AS numeric) as numpackets
from
    appqosdata.icmptraffic SD , appqosdata.icmptraffic_classes CL
where
    SD.detectorid >= 0 
    and CL.detectorid = SD.detectorid
    and CL.sessionid = SD.sessionid 
    and CL.createdtime = SD.createdtime 
    and SD.detectorid = 1
    and SD.createdtime >= 1317726000000000000::bigint and SD.createdtime < 1318326120000000000::bigint
    and CL.sessionid < 1318326120000000000::bigint
    and CL.classid = 1
group by
    SD.detectorid, timegroup

appqosdata.icmptraffic and appqosdata.icmptraffic_classes are both partitioned.

CREATE TABLE appqosdata.icmptraffic
(
    detectorid smallint not null default(0), -- references appqosdata.detectors(id),
    sessionid bigint not null,
    createdtime bigint not null,

...
   --primary key(detectorid, sessionid, createdtime)  defined in the children tables
);

CREATE TABLE appqosdata.icmptraffic_classes
(
    detectorid smallint not null,
    sessionid bigint not null,
    createdtime bigint not null,

    classid integer not null

   -- definitions in the children tables:
    --primary key(detectorid, sessionid, createdtime, classid)
    --foreign key(detectorid, sessionid, createdtime)  references appqosdata.icmptraffic(detectorid, sessionid, createdtime),
    --foreign key(classid) references appqosdata.display_classes(id),
);
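The commented-out constraints above live in the child tables. A hypothetical child partition (the name is assumed from the `_10` suffix visible in the plan below) would look roughly like:

```sql
-- Hypothetical child partition; columns are inherited from the parent,
-- only the primary key constraint is declared here, as the comments
-- in the parent definition indicate.
CREATE TABLE appqosdata.icmptraffic_classes_10
(
    PRIMARY KEY (detectorid, sessionid, createdtime, classid)
) INHERITS (appqosdata.icmptraffic_classes);
```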

HashAggregate  (cost=154.24..154.28 rows=1 width=34) (actual time=7594069.940..7594069.983 rows=19 loops=1)
  Output: (sd.detectorid)::numeric, ((((sd.createdtime - 0) / 18000000000000::bigint))::numeric), (sum(sd.datafromsource) + sum(sd.datafromdestination)), ((sum(sd.packetsfromsource) + sum(sd.packetsfromdestination)))::numeric, sd.detectorid
  ->  Nested Loop  (cost=0.00..154.23 rows=1 width=34) (actual time=0.140..7593838.258 rows=50712 loops=1)
        Output: sd.detectorid, sd.createdtime, sd.datafromsource, sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination, (((sd.createdtime - 0) / 18000000000000::bigint))::numeric
        Join Filter: ((sd.sessionid = cl.sessionid) AND (sd.createdtime = cl.createdtime))
        ->  Append  (cost=0.00..88.37 rows=7 width=18) (actual time=0.063..333.355 rows=51776 loops=1)
              ->  Seq Scan on appqosdata.icmptraffic_classes cl  (cost=0.00..37.48 rows=1 width=18) (actual time=0.013..0.013 rows=0 loops=1)
                    Output: cl.detectorid, cl.sessionid, cl.createdtime
                    Filter: ((cl.sessionid < 1318326120000000000::bigint) AND (cl.detectorid = 1) AND (cl.classid = 1))
              ->  Index Scan using icmptraffic_classes_10_pkey on appqosdata.icmptraffic_classes_10 cl  (cost=0.00..8.36 rows=1 width=18) (actual time=0.046..14.205 rows=3985 loops=1)
                    Output: cl.detectorid, cl.sessionid, cl.createdtime
                    Index Cond: ((cl.detectorid = 1) AND (cl.sessionid < 1318326120000000000::bigint) AND (cl.classid = 1))
              ->  Index Scan using icmptraffic_classes_11_pkey on appqosdata.icmptraffic_classes_11 cl  (cost=0.00..8.62 rows=1 width=18) (actual time=0.038..52.757 rows=14372 loops=1)
                    Output: cl.detectorid, cl.sessionid, cl.createdtime
                    Index Cond: ((cl.detectorid = 1) AND (cl.sessionid < 1318326120000000000::bigint) AND (cl.classid = 1))
              ->  Index Scan using icmptraffic_classes_12_pkey on appqosdata.icmptraffic_classes_12 cl  (cost=0.00..8.60 rows=1 width=18) (actual time=0.033..47.845 rows=13512 loops=1)
                    Output: cl.detectorid, cl.sessionid, cl.createdtime
                    Index Cond: ((cl.detectorid = 1) AND (cl.sessionid < 1318326120000000000::bigint) AND (cl.classid = 1))
              ->  Index Scan using icmptraffic_classes_13_pkey on appqosdata.icmptraffic_classes_13 cl  (cost=0.00..8.59 rows=1 width=18) (actual time=0.030..46.504 rows=13274 loops=1)
                    Output: cl.detectorid, cl.sessionid, cl.createdtime
                    Index Cond: ((cl.detectorid = 1) AND (cl.sessionid < 1318326120000000000::bigint) AND (cl.classid = 1))
              ->  Index Scan using icmptraffic_classes_14_pkey on appqosdata.icmptraffic_classes_14 cl  (cost=0.00..8.43 rows=1 width=18) (actual time=0.025..22.868 rows=6633 loops=1)
                    Output: cl.detectorid, cl.sessionid, cl.createdtime
                    Index Cond: ((cl.detectorid = 1) AND (cl.sessionid < 1318326120000000000::bigint) AND (cl.classid = 1))
              ->  Index Scan using icmptraffic_classes_15_pkey on appqosdata.icmptraffic_classes_15 cl  (cost=0.00..8.30 rows=1 width=18) (actual time=0.014..0.014 rows=0 loops=1)
                    Output: cl.detectorid, cl.sessionid, cl.createdtime
                    Index Cond: ((cl.detectorid = 1) AND (cl.sessionid < 1318326120000000000::bigint) AND (cl.classid = 1))
        ->  Materialize  (cost=0.00..65.13 rows=6 width=42) (actual time=0.001..73.261 rows=50915 loops=51776)
              Output: sd.detectorid, sd.createdtime, sd.datafromsource, sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination, sd.sessionid
              ->  Append  (cost=0.00..65.10 rows=6 width=42) (actual time=0.059..244.693 rows=50915 loops=1)
                    ->  Seq Scan on appqosdata.icmptraffic sd  (cost=0.00..22.60 rows=1 width=42) (actual time=0.001..0.001 rows=0 loops=1)
                          Output: sd.detectorid, sd.createdtime, sd.datafromsource, sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination, sd.sessionid
                          Filter: ((sd.detectorid >= 0) AND (sd.createdtime >= 1317726000000000000::bigint) AND (sd.createdtime < 1318326120000000000::bigint) AND (sd.detectorid = 1))
                    ->  Index Scan using icmptraffic_10_pkey on appqosdata.icmptraffic_10 sd  (cost=0.00..8.35 rows=1 width=42) (actual time=0.053..7.807 rows=3997 loops=1)
                          Output: sd.detectorid, sd.createdtime, sd.datafromsource, sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination, sd.sessionid
                          Index Cond: ((sd.detectorid >= 0) AND (sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND (sd.createdtime < 1318326120000000000::bigint))
                    ->  Index Scan using icmptraffic_11_pkey on appqosdata.icmptraffic_11 sd  (cost=0.00..8.59 rows=1 width=42) (actual time=0.025..27.957 rows=14372 loops=1)
                          Output: sd.detectorid, sd.createdtime, sd.datafromsource, sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination, sd.sessionid
                          Index Cond: ((sd.detectorid >= 0) AND (sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND (sd.createdtime < 1318326120000000000::bigint))
                    ->  Index Scan using icmptraffic_12_pkey on appqosdata.icmptraffic_12 sd  (cost=0.00..8.58 rows=1 width=42) (actual time=0.027..26.217 rows=13512 loops=1)
                          Output: sd.detectorid, sd.createdtime, sd.datafromsource, sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination, sd.sessionid
                          Index Cond: ((sd.detectorid >= 0) AND (sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND (sd.createdtime < 1318326120000000000::bigint))
                    ->  Index Scan using icmptraffic_13_pkey on appqosdata.icmptraffic_13 sd  (cost=0.00..8.56 rows=1 width=42) (actual time=0.030..26.075 rows=13430 loops=1)
                          Output: sd.detectorid, sd.createdtime, sd.datafromsource, sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination, sd.sessionid
                          Index Cond: ((sd.detectorid >= 0) AND (sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND (sd.createdtime < 1318326120000000000::bigint))
                    ->  Index Scan using icmptraffic_14_pkey on appqosdata.icmptraffic_14 sd  (cost=0.00..8.41 rows=1 width=42) (actual time=0.027..11.040 rows=5604 loops=1)
                          Output: sd.detectorid, sd.createdtime, sd.datafromsource, sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination, sd.sessionid
                          Index Cond: ((sd.detectorid >= 0) AND (sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND (sd.createdtime < 1318326120000000000::bigint))
Total runtime: 7594071.137 ms



             name             |                                             current_setting                                             
------------------------------+----------------------------------------------------------------------------------------------------------
 version                      | PostgreSQL 9.0.3 on amd64-portbld-freebsd8.0, compiled by GCC cc (GCC) 4.2.1 20070719  [FreeBSD], 64-bit
 autovacuum                   | on
 autovacuum_analyze_threshold | 500000
 autovacuum_max_workers       | 1
 autovacuum_naptime           | 1h
 autovacuum_vacuum_threshold  | 500000
 checkpoint_segments          | 64
 effective_cache_size         | 3GB
 fsync                        | on
 lc_collate                   | C
 lc_ctype                     | C
 listen_addresses             | *
 log_destination              | syslog, stderr
 log_min_duration_statement   | 5ms
 log_rotation_age             | 1d
 log_rotation_size            | 100MB
 logging_collector            | on
 max_connections              | 30
 max_stack_depth              | 2MB
 server_encoding              | UTF8
 shared_buffers               | 1793MB
 silent_mode                  | on
 synchronous_commit           | on
 syslog_facility              | local0
 TimeZone                     | Europe/Jersey
 update_process_title         | off
 wal_buffers                  | 128kB
 work_mem                     | 24MB

Thanks for any help,
Svetlin Manavski

Re: Join over two tables of 50K records takes 2 hours

From: Tom Lane
Svetlin Manavski <svetlin.manavski@gmail.com> writes:
> I am running 9.03 with the settings listed below. I have a prohibitively
> slow query in an application which has an overall good performance:

It's slow because the planner is choosing a nestloop join on the
strength of its estimate that there's only a half dozen rows to be
joined.  You need to figure out why those rowcount estimates are so bad.
I suspect that you've shot yourself in the foot by raising
autovacuum_analyze_threshold so high --- most likely, none of those
tables have ever gotten analyzed.  And what's with the high
autovacuum_naptime setting?  You might need to increase
default_statistics_target too, but first see if a manual ANALYZE makes
things better.

            regards, tom lane
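A sketch of the check Tom suggests, using the table names from this thread (the statistics-target value is only an example):

```sql
-- First, manually refresh the planner statistics on the joined tables.
ANALYZE appqosdata.icmptraffic;
ANALYZE appqosdata.icmptraffic_classes;

-- If the row estimates are still bad, raise the statistics target for
-- the session, re-analyze, and re-test the query:
SET default_statistics_target = 1000;
ANALYZE appqosdata.icmptraffic;
ANALYZE appqosdata.icmptraffic_classes;
```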

Re: Join over two tables of 50K records takes 2 hours

From: Scott Marlowe
It seems like your row estimates are way off, with the planner
expecting 1 row and getting 3000 or so.  Have you tried cranking up
default_statistics_target to, say, 1000, running ANALYZE and seeing
what happens?

If that doesn't do it, try temporarily turning off nested loops:

set enable_nestloop = off;
explain analyze yourqueryhere;

Re: Join over two tables of 50K records takes 2 hours

From: Scott Marlowe
On Fri, Oct 14, 2011 at 2:35 AM, Svetlin Manavski
<svetlin.manavski@gmail.com> wrote:
> Thank you guys for spotting the problem immediately.
> The reason for such autovacuum thresholds is that these tables are designed
> for very high rate of inserts and I have a specific routine to analyze them
> in a more controlled way. Infact the stats target of some of the fields is
> also high. However that routine was failing to perform analyze on
> appqosdata.icmptraffic and its children due to another issue...

Note that it's possible to set many of the autovacuum values per table
so you don't have to do a brainyectomy on the whole autovac system.
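A sketch of such a per-table override (the threshold values here are just copied from the global settings shown earlier):

```sql
-- Per-table storage parameters override the global autovacuum settings,
-- so only the high-insert-rate tables get the relaxed thresholds while
-- the rest of the database keeps sane defaults.
ALTER TABLE appqosdata.icmptraffic
    SET (autovacuum_analyze_threshold = 500000,
         autovacuum_vacuum_threshold  = 500000);
```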

Re: Join over two tables of 50K records takes 2 hours

From: Svetlin Manavski
Thank you guys for spotting the problem immediately.
The reason for such autovacuum thresholds is that these tables are designed for a very high rate of inserts, and I have a specific routine to analyze them in a more controlled way. In fact, the stats target of some of the fields is also high. However, that routine was failing to perform ANALYZE on appqosdata.icmptraffic and its children due to another issue...

Regards,
Svetlin Manavski

On Fri, Oct 14, 2011 at 5:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Svetlin Manavski <svetlin.manavski@gmail.com> writes:
> I am running 9.03 with the settings listed below. I have a prohibitively
> slow query in an application which has an overall good performance:

It's slow because the planner is choosing a nestloop join on the
strength of its estimate that there's only a half dozen rows to be
joined.  You need to figure out why those rowcount estimates are so bad.
I suspect that you've shot yourself in the foot by raising
autovacuum_analyze_threshold so high --- most likely, none of those
tables have ever gotten analyzed.  And what's with the high
autovacuum_naptime setting?  You might need to increase
default_statistics_target too, but first see if a manual ANALYZE makes
things better.

                       regards, tom lane