Thread: Q: Performance of join vs embedded query for simple queries?

Q: Performance of join vs embedded query for simple queries?

From: mark@mark.mielke.cc
I have two simple queries that do what I believe to be the exact same
thing. I was surprised to see a reliable, and what I consider to be
significant (although not problematic for my application) difference
in execution time. It hints to me that PostgreSQL may be missing an
optimization opportunity? This is on PostgreSQL 8.1.4.

For a quick summary of the relationships:

I have a 79-row "system" table that describes each ClearCase system.
ClearCase uses a uuid to uniquely identify database objects across the
life of the object. For this table, I store the uid as a varchar(200), and
have a unique index on it:

neudb=> \d sm_system
                                        Table "public.sm_system"
   Column    |          Type          |                            Modifiers
-------------+------------------------+-----------------------------------------------------------------
 system_dbid | integer                | not null default nextval('sm_system_system_dbid_seq'::regclass)
 type        | character varying(10)  | not null
 uid         | character varying(200) | not null
 name        | character varying(200) | not null
 owner       | character varying(80)  | not null
Indexes:
    "sm_system_pkey" PRIMARY KEY, btree (system_dbid) CLUSTER
    "sm_system_type_key" UNIQUE, btree ("type", uid)
Check constraints:
    "sm_system_type_check" CHECK ("type"::text = 'NEU'::text OR "type"::text = 'PLS'::text)

I have a 339,586-row "change" table that describes each ClearCase
activity. Each activity has a name that should be unique within the
system that contains it, but may not be unique across time.

                                             Table "public.sm_change"
     Column     |              Type              |                            Modifiers
----------------+--------------------------------+-----------------------------------------------------------------
 change_dbid    | integer                        | not null default nextval('sm_change_change_dbid_seq'::regclass)
 system_dbid    | integer                        | not null
 stream_dbid    | integer                        | not null
 uid            | character varying(200)         | not null
 name           | character varying(200)         | not null
 status         | character varying(20)          | not null
 owner          | character varying(80)          | not null
 target         | integer                        |
 creationtime   | timestamp(0) without time zone | not null
 submissiontime | timestamp(0) without time zone | not null
 comments       | text                           |
 elements       | text                           |
Indexes:
    "sm_change_pkey" PRIMARY KEY, btree (change_dbid) CLUSTER
    "sm_change_system_dbid_key" UNIQUE, btree (system_dbid, uid)
    "sm_change_name_key" btree (lower(name::text))
    "sm_change_stream_dbid_key" btree (stream_dbid)
    "sm_change_target_key" btree (target)
Foreign-key constraints:
    "sm_change_stream_dbid_fkey" FOREIGN KEY (stream_dbid) REFERENCES sm_stream(stream_dbid)
    "sm_change_system_dbid_fkey" FOREIGN KEY (system_dbid) REFERENCES sm_system(system_dbid)
    "sm_change_target_fkey" FOREIGN KEY (target) REFERENCES sm_change(change_dbid)


One of the made-up queries I played with was a lookup by the system uuid and the
activity name. This is the query where I noticed the timing difference:

neudb=> select uid, name from sm_change where system_dbid = (select system_dbid from sm_system where uid =
'2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da') and lower(name) = lower('markm-Q00855572');
                   uid                    |      name
------------------------------------------+-----------------
 ff733174.6c7411d8.900c.00:06:5b:b3:db:28 | markm-Q00855572
(1 row)

Time: 1.242 ms


The 1.242 ms is pretty stable, ranging from 1.226 ms to 1.248 ms over 5 runs.

Then we have:

neudb=> select sm_change.uid, sm_change.name from sm_change join sm_system using (system_dbid) where sm_system.uid =
'2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da' and lower(sm_change.name) = lower('markm-Q00855572');
                   uid                    |      name
------------------------------------------+-----------------
 ff733174.6c7411d8.900c.00:06:5b:b3:db:28 | markm-Q00855572
(1 row)

Time: 1.500 ms


This time is less stable - it ranges from 1.394 ms to 1.561 ms over 5 runs.

As I mentioned - for my application, I don't really care. If it took
10 ms or more, I wouldn't care. But the difference in time bothered me.
So, here are the query plans that PostgreSQL selected for me:


neudb=> explain analyze select uid, name from sm_change where system_dbid = (select system_dbid from sm_system where
uid = '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da') and lower(name) = lower('markm-Q00855572');
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Index Scan using sm_change_name_key on sm_change  (cost=2.99..7.82 rows=1 width=80) (actual time=0.322..0.328 rows=1 loops=1)
   Index Cond: (lower((name)::text) = 'markm-q00855572'::text)
   Filter: (system_dbid = $0)
   InitPlan
     ->  Seq Scan on sm_system  (cost=0.00..2.99 rows=1 width=4) (actual time=0.052..0.106 rows=1 loops=1)
           Filter: ((uid)::text = '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da'::text)
 Total runtime: 0.419 ms
(7 rows)

Time: 16.494 ms


neudb=> explain analyze select sm_change.uid, sm_change.name from sm_change join sm_system using (system_dbid) where
sm_system.uid = '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da' and lower(sm_change.name) = lower('markm-Q00855572');
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..7.83 rows=1 width=80) (actual time=0.099..0.159 rows=1 loops=1)
   Join Filter: ("outer".system_dbid = "inner".system_dbid)
   ->  Index Scan using sm_change_name_key on sm_change  (cost=0.00..4.83 rows=1 width=84) (actual time=0.053..0.059 rows=1 loops=1)
         Index Cond: (lower((name)::text) = 'markm-q00855572'::text)
   ->  Seq Scan on sm_system  (cost=0.00..2.99 rows=1 width=4) (actual time=0.030..0.077 rows=1 loops=1)
         Filter: ((uid)::text = '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da'::text)
 Total runtime: 0.250 ms
(7 rows)

Time: 1.898 ms


I'm still learning how PostgreSQL works internally. My understanding
is that the above are essentially the same. The first plan finds the system
row using a sequential scan, then looks up the change row using the
index, filtering on the system value. The second plan finds the change row
using the same index, expecting (and finding) exactly one row, and then
matches it up against the system row using a sequential scan.

So why does one reliably run faster than the other?

neudb=> prepare plan1 (varchar(80), varchar(80)) as select uid, name from sm_change where system_dbid = (select
system_dbid from sm_system where uid = $1) and lower(name) = lower($2);

neudb=> prepare plan2 (varchar(80), varchar(80)) as select sm_change.uid, sm_change.name from sm_change join sm_system
using (system_dbid) where sm_system.uid = $1 and lower(sm_change.name) = lower($2);

Now:

neudb=> execute plan1 ('2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da', 'markm-q00855572');
                   uid                    |      name
------------------------------------------+-----------------
 ff733174.6c7411d8.900c.00:06:5b:b3:db:28 | markm-Q00855572
(1 row)

Time: 0.794 ms


neudb=> execute plan2 ('2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da', 'markm-q00855572');
                   uid                    |      name
------------------------------------------+-----------------
 ff733174.6c7411d8.900c.00:06:5b:b3:db:28 | markm-Q00855572
(1 row)

Time: 0.715 ms


The numbers above don't mean much on their own. I ran both a few dozen times, and my
conclusion is that once the plan is prepared (I used explain analyze to confirm that
the prepared plans were the same as the dynamically generated ones), the times are the
same. Both ranged from 0.690 ms to 0.850 ms. Timings at this resolution are not very
reliable. :-)

I think this means that the planner takes longer to figure out what to do about the
join, and that writing the select out as an embedded subselect reduces the effort
required by the planner. This makes sense to me, except that I thought PostgreSQL
would convert back and forth between the two forms automatically. They are the same
query, are they not? Why wouldn't they both take longer, or both take less time? What
if I invented a scenario where the difference in plans made a major difference, such
as making the system table much larger, still without an index? Shouldn't they both
come up with the same plan - the plan with the better estimate?
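
One rough way to look at the planning cost on its own (just an idea - I haven't
benchmarked it carefully) is to time a bare EXPLAIN with \timing on. EXPLAIN plans
the query but doesn't execute it, so the reported time is mostly parse and plan work
plus the client round trip:

neudb=> \timing
neudb=> explain select uid, name from sm_change
        where system_dbid = (select system_dbid from sm_system
                             where uid = '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da')
          and lower(name) = lower('markm-Q00855572');
neudb=> explain select sm_change.uid, sm_change.name
        from sm_change join sm_system using (system_dbid)
        where sm_system.uid = '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da'
          and lower(sm_change.name) = lower('markm-Q00855572');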

Am I expecting too much? :-)

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   |
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/


Re: Q: Performance of join vs embedded query for simple queries?

From: Tom Lane
Date: Thu, 17 Aug 2006 21:21:33 -0400
mark@mark.mielke.cc writes:
> I have two simple queries that do what I believe to be the exact same
> thing.

These are actually not equivalent per spec.

> neudb=> select uid, name from sm_change where system_dbid = (select system_dbid from sm_system where uid =
'2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da') and lower(name) = lower('markm-Q00855572');

> neudb=> select sm_change.uid, sm_change.name from sm_change join sm_system using (system_dbid) where sm_system.uid =
'2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da' and lower(sm_change.name) = lower('markm-Q00855572');

The subselect form constrains the sub-select to return at most one row
--- you'd have gotten an error if there were more than one sm_system row
with that uid.  The join form does not make this constraint.

Another related form is

neudb=> select uid, name from sm_change where system_dbid IN (select system_dbid from sm_system where uid =
'2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da') and lower(name) = lower('markm-Q00855572');

This still isn't equivalent to the join: it'll return at most one copy
of any sm_change row, whereas you can get multiple copies of the same
sm_change row from the join, if there were multiple matching sm_system
rows.  (Hm, given the unique index on (system_dbid, uid), I guess that
couldn't actually happen --- but you have to reason about it knowing
that that index is there, it's not obvious from the form of the query.)
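
To make those semantic differences concrete, here's a throwaway example with
hypothetical tables (no unique constraints anywhere, unlike your schema):

create temp table sys (system_dbid int, uid text);
create temp table chg (change_dbid int, system_dbid int, name text);
insert into sys values (1, 'u1'), (1, 'u1');         -- two matching rows
insert into chg values (10, 1, 'markm-Q00855572');

-- join form: the single chg row comes back twice, once per matching sys row
select chg.name from chg join sys using (system_dbid) where sys.uid = 'u1';

-- IN form: at most one copy of each chg row, however many sys rows match
select name from chg where system_dbid in
    (select system_dbid from sys where uid = 'u1');

-- "= subselect" form: fails outright, since the subselect returns two rows
-- ERROR:  more than one row returned by a subquery used as an expression
select name from chg where system_dbid =
    (select system_dbid from sys where uid = 'u1');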

Anyway: given the way that the planner works, the IN form and the join
form will probably take comparable amounts of time to plan.  The "=
subselect" form is much more constrained in terms of the number of
alternative implementations we have, so it doesn't surprise me that it
takes less time to plan.

            regards, tom lane

Re: Q: Performance of join vs embedded query for simple queries?

From: mark@mark.mielke.cc
On Thu, Aug 17, 2006 at 09:21:33PM -0400, Tom Lane wrote:
> mark@mark.mielke.cc writes:
> > I have two simple queries that do what I believe to be the exact same
> > thing.
> These are actually not equivalent per spec.
> ...
> This still isn't equivalent to the join: it'll return at most one copy
> of any sm_change row, whereas you can get multiple copies of the same
> sm_change row from the join, if there were multiple matching sm_system
> rows.  (Hm, given the unique index on (system_dbid, uid), I guess that
> couldn't actually happen --- but you have to reason about it knowing
> that that index is there, it's not obvious from the form of the query.)

> Anyway: given the way that the planner works, the IN form and the join
> form will probably take comparable amounts of time to plan.  The "=
> subselect" form is much more constrained in terms of the number of
> alternative implementations we have, so it doesn't surprise me that it
> takes less time to plan.

That makes sense. Would it be reasonable for the planner to eliminate
plan considerations based on the existence of unique indexes, or is
this a fundamentally difficult thing to get right in the general case?

I did the elimination in my head, which is why I considered the plans to
be the same. Can the planner do it?

Sub-millisecond planning/execution for simple queries on moderate
hardware seems sexy... :-)

Thanks,
mark



Re: Q: Performance of join vs embedded query for simple queries?

From: mark@mark.mielke.cc
On Thu, Aug 17, 2006 at 09:21:33PM -0400, Tom Lane wrote:
> Another related form is
>
> neudb=> select uid, name from sm_change where system_dbid IN (select system_dbid from sm_system where uid =
'2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da') and lower(name) = lower('markm-Q00855572');
> ...
> Anyway: given the way that the planner works, the IN form and the join
> form will probably take comparable amounts of time to plan.  The "=
> subselect" form is much more constrained in terms of the number of
> alternative implementations we have, so it doesn't surprise me that it
> takes less time to plan.

FYI: You are correct. The IN form takes about as long as the join to plan,
and generates the same plan as the join. This restores my confidence that
PostgreSQL is able to understand the two forms as equivalent.

With regard to that unique constraint planning - I gave you the wrong
query from my log. I had already thought that through, and realized
that my original query was missing the type. The timings and plans are
functionally the same for all three queries, with or without the
type qualifier. This is the table:

                                        Table "public.sm_system"
   Column    |          Type          |                            Modifiers
-------------+------------------------+-----------------------------------------------------------------
 system_dbid | integer                | not null default nextval('sm_system_system_dbid_seq'::regclass)
 type        | character varying(10)  | not null
 uid         | character varying(200) | not null
 name        | character varying(200) | not null
 owner       | character varying(80)  | not null
Indexes:
    "sm_system_pkey" PRIMARY KEY, btree (system_dbid) CLUSTER
    "sm_system_type_key" UNIQUE, btree ("type", uid)
Check constraints:
    "sm_system_type_check" CHECK ("type"::text = 'NEU'::text OR "type"::text = 'PLS'::text)

And this is what the query should have been:

neudb=> explain analyze select uid, name from sm_change where system_dbid IN (select system_dbid from sm_system where
type = 'NEU' and uid = '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da') and lower(name) = lower('markm-Q00855572');
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=0.00..7.86 rows=1 width=80) (actual time=19.438..19.453 rows=1 loops=1)
   ->  Index Scan using sm_change_name_key on sm_change  (cost=0.00..4.83 rows=1 width=84) (actual time=0.064..0.073 rows=1 loops=1)
         Index Cond: (lower((name)::text) = 'markm-q00855572'::text)
   ->  Index Scan using sm_system_pkey on sm_system  (cost=0.00..3.02 rows=1 width=4) (actual time=19.358..19.358 rows=1 loops=1)
         Index Cond: ("outer".system_dbid = sm_system.system_dbid)
         Filter: ((("type")::text = 'NEU'::text) AND ((uid)::text = '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da'::text))
 Total runtime: 19.568 ms
(7 rows)

Time: 21.449 ms


I guess the case isn't as simple as I thought. The planner would need to
recognize that both 'type' and 'uid' are specified as constants and are
covered by a unique index, and therefore that the argument to the IN (or
the table being joined against) will produce either 0 rows or 1 row. Too
complicated to be worth it, eh? :-)
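
(For what it's worth, I can do that reasoning by hand: writing it back as the
"= subselect" form with both columns qualified should presumably give me the cheap
InitPlan-style plan again, with the at-most-one-row rule enforced at runtime rather
than deduced by the planner:)

neudb=> explain analyze select uid, name from sm_change
        where system_dbid = (select system_dbid from sm_system
                             where type = 'NEU'
                               and uid = '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da')
          and lower(name) = lower('markm-Q00855572');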

Cheers,
mark



Re: Q: Performance of join vs embedded query for simple queries?

From: Tom Lane
mark@mark.mielke.cc writes:
> That makes sense. Would it be reasonable for the planner to eliminate
> plan considerations based on the existence of unique indexes, or is
> this a fundamentally difficult thing to get right in the general case?

The big obstacle to that at the moment is that we don't have any plan
cache invalidation mechanism; so a plan that depended for correctness on
the existence of a unique index might silently give wrong results after
someone drops the index and inserts non-unique values into the table.
(If the plan actually *uses* the index, then you'd at least get an
access failure ... but if the index was merely used to make an
assumption at plan time, you wouldn't.)
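
Concretely, the hazard would play out something like this (a hypothetical sketch
using your tables --- today's planner doesn't make the deduction at all, which is
exactly the point):

prepare p (varchar, varchar) as
  select uid, name from sm_change
  where system_dbid in (select system_dbid from sm_system
                        where type = 'NEU' and uid = $1)
    and lower(name) = lower($2);
-- suppose the stored plan had been allowed to assume (type, uid) is unique

alter table sm_system drop constraint sm_system_type_key;
-- (or DROP INDEX, if it was created as a plain unique index)

insert into sm_system (type, uid, name, owner)
  values ('NEU', '2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da', 'duplicate', 'nobody');

execute p ('2ff5942c.dd2911d5.ad56.08:00:09:fd:1b:da', 'markm-Q00855572');
-- the cached plan is neither invalidated nor re-checked, so a shortcut based on
-- the vanished uniqueness guarantee could silently return wrong answers here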

The present "constraint_exclusion" mechanism will actually fail in
exactly this kind of scenario, which is why I insisted it be off by
default :-(

This has been on the radar screen for awhile.  I'd hoped we'd get a
plan invalidation mechanism in place for 8.2, but seems that's not
happening.  Eventually it'll be there, though, and then we can get
more aggressive about making deductions based on constraints.

            regards, tom lane