Thread: simple query join
Hi all,
I've got what should be a relatively simple join between two tables that is taking forever and I can't work out why.
Version 7.3.4RH.
It can't be upgraded because the system is kept in sync with RedHat Enterprise (using up2date). Not my system otherwise I'd do that :(
Database has been 'vacuum analyze'd.
blah=> \d sq_asset;
Table "public.sq_asset"
Column | Type | Modifiers
----------------+-----------------------------+--------------------------
type_code | character varying(100) | not null
version | character varying(20) | not null default '0.0.0'
name | character varying(255) | not null default ''
short_name | character varying(255) | not null default ''
status | integer | not null default 1
languages | character varying(50) | not null default ''
charset | character varying(50) | not null default ''
force_secure | character(1) | not null default '0'
created | timestamp without time zone | not null
updated | timestamp without time zone | not null
created_userid | character varying(255) | not null default '0'
updated_userid | character varying(255) | not null default '0'
assetid | integer | not null default 0
Indexes: sq_asset_pkey primary key btree (assetid)
Table "public.sq_asset"
Column | Type | Modifiers
----------------+-----------------------------+--------------------------
type_code | character varying(100) | not null
version | character varying(20) | not null default '0.0.0'
name | character varying(255) | not null default ''
short_name | character varying(255) | not null default ''
status | integer | not null default 1
languages | character varying(50) | not null default ''
charset | character varying(50) | not null default ''
force_secure | character(1) | not null default '0'
created | timestamp without time zone | not null
updated | timestamp without time zone | not null
created_userid | character varying(255) | not null default '0'
updated_userid | character varying(255) | not null default '0'
assetid | integer | not null default 0
Indexes: sq_asset_pkey primary key btree (assetid)
blah=> select count(*) from sq_asset;
count
-------
16467
(1 row)
count
-------
16467
(1 row)
blah=> \d sq_asset_permission;
Table "public.sq_asset_permission"
Column | Type | Modifiers
------------+------------------------+----------------------
permission | integer | not null default 0
access | character(1) | not null default '0'
assetid | character varying(255) | not null default '0'
userid | character varying(255) | not null default '0'
Indexes: sq_asset_permission_pkey primary key btree (assetid, userid, permission)
"sq_asset_permission_access" btree ("access")
"sq_asset_permission_assetid" btree (assetid)
"sq_asset_permission_permission" btree (permission)
"sq_asset_permission_userid" btree (userid)
Table "public.sq_asset_permission"
Column | Type | Modifiers
------------+------------------------+----------------------
permission | integer | not null default 0
access | character(1) | not null default '0'
assetid | character varying(255) | not null default '0'
userid | character varying(255) | not null default '0'
Indexes: sq_asset_permission_pkey primary key btree (assetid, userid, permission)
"sq_asset_permission_access" btree ("access")
"sq_asset_permission_assetid" btree (assetid)
"sq_asset_permission_permission" btree (permission)
"sq_asset_permission_userid" btree (userid)
blah=> select count(*) from sq_asset_permission;
count
-------
73715
(1 row)
count
-------
73715
(1 row)
EXPLAIN ANALYZE SELECT p.*
FROM sq_asset a, sq_asset_permission p
WHERE a.assetid = p.assetid
AND p.permission = '1'
AND p.access = '1'
AND p.userid = '0';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..4743553.10 rows=2582 width=27) (actual time=237.91..759310.60 rows=11393 loops=1)
Join Filter: (("inner".assetid)::text = ("outer".assetid)::text)
-> Seq Scan on sq_asset_permission p (cost=0.00..1852.01 rows=2288 width=23) (actual time=0.06..196.90 rows=12873 loops=1)
Filter: ((permission = 1) AND ("access" = '1'::bpchar) AND (userid = '0'::character varying))
-> Seq Scan on sq_asset a (cost=0.00..1825.67 rows=16467 width=4) (actual time=1.40..29.09 rows=16467 loops=12873)
Total runtime: 759331.85 msec
(6 rows)
FROM sq_asset a, sq_asset_permission p
WHERE a.assetid = p.assetid
AND p.permission = '1'
AND p.access = '1'
AND p.userid = '0';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..4743553.10 rows=2582 width=27) (actual time=237.91..759310.60 rows=11393 loops=1)
Join Filter: (("inner".assetid)::text = ("outer".assetid)::text)
-> Seq Scan on sq_asset_permission p (cost=0.00..1852.01 rows=2288 width=23) (actual time=0.06..196.90 rows=12873 loops=1)
Filter: ((permission = 1) AND ("access" = '1'::bpchar) AND (userid = '0'::character varying))
-> Seq Scan on sq_asset a (cost=0.00..1825.67 rows=16467 width=4) (actual time=1.40..29.09 rows=16467 loops=12873)
Total runtime: 759331.85 msec
(6 rows)
It's a straight join so I can't see why it would be this slow.. The tables are pretty small too.
Thanks for any suggestions :)
Chris.
Looks to me like it's because your assetid is varchar in one table and an integer in the other table. AFAIK, PG is unable to use an index join when the join types are different. The query plan shows it is doing full table scans of both tables.
Change both to varchar or both to integer and see what happens.
Also make sure to vacuum analyze the tables regularly to keep the query planner statistics up-to-date.
Cheers,
Steve Butler
assetid | integer | not null default 0
Indexes: sq_asset_pkey primary key btree (assetid)assetid | character varying(255) | not null default '0'
EXPLAIN ANALYZE SELECT p.*
FROM sq_asset a, sq_asset_permission p
WHERE a.assetid = p.assetid
AND p.permission = '1'
AND p.access = '1'
AND p.userid = '0';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..4743553.10 rows=2582 width=27) (actual time=237.91..759310.60 rows=11393 loops=1)
Join Filter: (("inner".assetid)::text = ("outer".assetid)::text)
-> Seq Scan on sq_asset_permission p (cost=0.00..1852.01 rows=2288 width=23) (actual time=0.06..196.90 rows=12873 loops=1)
Filter: ((permission = 1) AND ("access" = '1'::bpchar) AND (userid = '0'::character varying))
-> Seq Scan on sq_asset a (cost=0.00..1825.67 rows=16467 width=4) (actual time=1.40..29.09 rows=16467 loops=12873)
Total runtime: 759331.85 msec
(6 rows)
On Mon, 8 Mar 2004, Chris Smith wrote: > assetid | integer | not null default 0 > assetid | character varying(255) | not null default '0' The types above does not match, and these are the attributes you use to join. -- /Dennis Björklund
Eek. Casting both to varchar makes it super quick so I'll fix up the tables.
Added to the list of things to check for next time...
On a side note - I tried it with 7.4.1 on another box and it handled it ok.
Thanks again :)
Chris.
-----Original Message-----
From: Steven Butler [mailto:stevenb@kjross.com.au]
Sent: Monday, March 08, 2004 6:12 PM
To: Chris Smith; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] simple query joinLooks to me like it's because your assetid is varchar in one table and an integer in the other table. AFAIK, PG is unable to use an index join when the join types are different. The query plan shows it is doing full table scans of both tables.Change both to varchar or both to integer and see what happens.Also make sure to vacuum analyze the tables regularly to keep the query planner statistics up-to-date.Cheers,Steve Butlerassetid | integer | not null default 0
Indexes: sq_asset_pkey primary key btree (assetid)assetid | character varying(255) | not null default '0'
EXPLAIN ANALYZE SELECT p.*
FROM sq_asset a, sq_asset_permission p
WHERE a.assetid = p.assetid
AND p.permission = '1'
AND p.access = '1'
AND p.userid = '0';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..4743553.10 rows=2582 width=27) (actual time=237.91..759310.60 rows=11393 loops=1)
Join Filter: (("inner".assetid)::text = ("outer".assetid)::text)
-> Seq Scan on sq_asset_permission p (cost=0.00..1852.01 rows=2288 width=23) (actual time=0.06..196.90 rows=12873 loops=1)
Filter: ((permission = 1) AND ("access" = '1'::bpchar) AND (userid = '0'::character varying))
-> Seq Scan on sq_asset a (cost=0.00..1825.67 rows=16467 width=4) (actual time=1.40..29.09 rows=16467 loops=12873)
Total runtime: 759331.85 msec
(6 rows)