Slow join on partitioned table - Mailing list pgsql-performance

From Mark Thornton
Subject Slow join on partitioned table
Date
Msg-id 4D70CFC7.5050805@optrak.co.uk
Whole thread Raw
Responses Re: Slow join on partitioned table  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
The query plan appends sequential scans on the tables in the partition
(9 tables, ~4 million rows) and then hash joins that with a 14 row
table. The join condition is the primary key of each table in the
partition (and would be the primary key of the parent if that was
supported).
It would be much faster if it did an index scan on each of the child
tables and merged the results.

I can achieve this manually by rewriting the query as a union between
queries against each of the child tables. Is there a better way? (I'm
using PostGreSQL 8.4 with PostGIS 1.4).

Regards,
Mark Thornton

The query:

select rriID,ST_AsBinary(centreLine),ST_AsBinary(point)
         from "RoadLinkInformation" join LinkIds on
"RoadLinkInformation".rriID=LinkIds.featureid
         join "MasterRoadLinks" on
"RoadLinkInformation".roadLinkID="MasterRoadLinks".featureID

Table definitions

create temporary table LinkIds (featureid bigint not null)

create table "RoadLinkInformation" (
     rriid bigint not null primary key,
     roadlinkid bigint not null,
     point geometry,
     bound geometry
)

create table "MasterRoadLinks" (
             featureId bigint not null,
             centreLine geometry not null,
... other columns omitted for clarity
             )

All "RoadLinks/*" tables are children with the same structure and
featureId as the primary key.

The LinkIds table is the result of a previous query and contains just 14
rows (in this example).

Running the query against a view constructed as a union of the child
tables results in essentially the same query plan.

                                                                                       QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Hash Join  (cost=16097.75..191987.73 rows=33266 width=683) (actual
time=3003.302..5387.541 rows=14 loops=1)
    Hash Cond: (public."MasterRoadLinks".featureid =
"RoadLinkInformation".roadlinkid)
    ->  Append  (cost=0.00..159902.54 rows=4130254 width=583) (actual
time=2.357..4056.404 rows=4129424 loops=1)
          ->  Seq Scan on "MasterRoadLinks"  (cost=0.00..18.30 rows=830
width=40) (actual time=0.002..0.002 rows=0 loops=1)
          ->  Seq Scan on "RoadLinks/A Road" "MasterRoadLinks"
(cost=0.00..22531.32 rows=378732 width=519) (actual time=2.352..268.170
rows=378732 loops=1)
          ->  Seq Scan on "RoadLinks/B Road" "MasterRoadLinks"
(cost=0.00..6684.19 rows=182819 width=587) (actual time=0.008..114.671
rows=182819 loops=1)
          ->  Seq Scan on "RoadLinks/Alley" "MasterRoadLinks"
(cost=0.00..2973.31 rows=100731 width=353) (actual time=0.008..59.283
rows=100731 loops=1)
          ->  Seq Scan on "RoadLinks/Local Street" "MasterRoadLinks"
(cost=0.00..67255.79 rows=2063279 width=450) (actual
time=0.048..1281.454 rows=2063279 loops=1)
          ->  Seq Scan on "RoadLinks/Minor Road" "MasterRoadLinks"
(cost=0.00..30733.42 rows=722942 width=784) (actual time=0.047..517.770
rows=722942 loops=1)
          ->  Seq Scan on "RoadLinks/Motorway" "MasterRoadLinks"
(cost=0.00..683.03 rows=15403 width=820) (actual time=0.005..10.809
rows=15403 loops=1)
          ->  Seq Scan on "RoadLinks/Pedestrianised Street"
"MasterRoadLinks"  (cost=0.00..92.93 rows=2993 width=399) (actual
time=0.008..1.971 rows=2993 loops=1)
          ->  Seq Scan on "RoadLinks/Private Road - Publicly Accessible"
"MasterRoadLinks"  (cost=0.00..1187.79 rows=30579 width=662) (actual
time=0.006..21.177 rows=30579 loops=1)
          ->  Seq Scan on "RoadLinks/Private Road - Restricted Access"
"MasterRoadLinks"  (cost=0.00..27742.46 rows=631946 width=855) (actual
time=0.047..459.302 rows=631946 loops=1)
    ->  Hash  (cost=16071.00..16071.00 rows=2140 width=116) (actual
time=0.205..0.205 rows=14 loops=1)
          ->  Nested Loop  (cost=0.00..16071.00 rows=2140 width=116)
(actual time=0.045..0.183 rows=14 loops=1)
                ->  Seq Scan on linkids  (cost=0.00..31.40 rows=2140
width=8) (actual time=0.006..0.012 rows=14 loops=1)
                ->  Index Scan using "RoadLinkInformation_pkey" on
"RoadLinkInformation"  (cost=0.00..7.48 rows=1 width=116) (actual
time=0.008..0.009 rows=1 loops=14)
                      Index Cond: ("RoadLinkInformation".rriid =
linkids.featureid)
  Total runtime: 5387.734 ms
(19 rows)


pgsql-performance by date:

Previous
From: Vidhya Bondre
Date:
Subject: Re: Vacuum problem due to temp tables
Next
From: Robert Haas
Date:
Subject: Re: Vacuum problem due to temp tables