Thread: Two queries are better than one?
I ran into a situation today maintaining someone else's code where the sum time running 2 queries seems to be faster than 1. The original code was split into two queries. I thought about joining them, but considering the intelligence of my predecessor, I wanted to test it. The question is, which technique is really faster? Is there some hidden setup cost I don't see with explain analyze? Postgres 7.4.7, Redhat AES 3 Each query individually: test=> explain analyze test-> select * from order WHERE ord_batch='343B' AND ord_id='12-645'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Index Scan using order_pkey on order (cost=0.00..6.02 rows=1 width=486) (actual time=0.063..0.066 rows=1 loops=1) Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar)) Total runtime: 0.172 ms (3 rows) test=> explain analyze test-> select cli_name from client where cli_code='1837'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Index Scan using client_pkey on client (cost=0.00..5.98 rows=2 width=39) (actual time=0.043..0.047 rows=1 loops=1) Index Cond: (cli_code = '1837'::bpchar) Total runtime: 0.112 ms (3 rows) Joined: test=> explain analyze test-> SELECT cli_name,order.* test-> FROM order test-> JOIN client ON (ord_client = cli_code) test-> WHERE ord_batch='343B' AND ord_id='12-645'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..12.00 rows=2 width=525) (actual time=0.120..0.128 rows=1 loops=1) -> Index Scan using order_pkey on order (cost=0.00..6.02 rows=1 width=486) (actual time=0.064..0.066 rows=1 loops=1) Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar)) -> Index Scan using client_pkey on client (cost=0.00..5.98 rows=1 width=51) (actual time=0.023..0.026 rows=1 loops=1) Index Cond: ("outer".ord_client = client.cli_code) Total runtime: 0.328 ms (6 rows) -- Karim Nassar <karim.nassar@acm.org>
On Thu, Jul 28, 2005 at 04:04:25PM -0700, Karim Nassar wrote: > I ran into a situation today maintaining someone else's code where the > sum time running 2 queries seems to be faster than 1. The original code > was split into two queries. I thought about joining them, but > considering the intelligence of my predecessor, I wanted to test it. > > The question is, which technique is really faster? Is there some hidden > setup cost I don't see with explain analyze? To see which technique will be faster in your application, time the application code. The queries you show are taking fractions of a millisecond; the communications overhead of executing two queries might make that technique significantly slower than just the server execution time that EXPLAIN ANALYZE shows. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
Karim Nassar wrote: > I ran into a situation today maintaining someone else's code where the > sum time running 2 queries seems to be faster than 1. The original code > was split into two queries. I thought about joining them, but > considering the intelligence of my predecessor, I wanted to test it. > > The question is, which technique is really faster? Is there some hidden > setup cost I don't see with explain analyze? Yes, the time it takes your user code to parse the result, and create the new query. :) It does seem like you are taking an extra 0.1ms for the combined query, but that means you don't have another round trip to the database. So that would mean one less context switch, and you don't need to know what the cli_code is before you can get the cli_name. I would guess the overhead is the time for postgres to parse out the text, place another index query, and then combine the rows. It seems like this shouldn't take 0.1ms, but then again, that isn't very long. Also, did you run it *lots* of times to make sure that this isn't just noise? John =:-> > > Postgres 7.4.7, Redhat AES 3 > > Each query individually: > > test=> explain analyze > test-> select * from order WHERE ord_batch='343B' AND ord_id='12-645'; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------- > Index Scan using order_pkey on order (cost=0.00..6.02 rows=1 width=486) (actual time=0.063..0.066 rows=1 loops=1) > Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar)) > Total runtime: 0.172 ms > (3 rows) > > > test=> explain analyze > test-> select cli_name from client where cli_code='1837'; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------- > Index Scan using client_pkey on client (cost=0.00..5.98 rows=2 width=39) (actual time=0.043..0.047 rows=1 loops=1) > Index Cond: (cli_code = '1837'::bpchar) > Total runtime: 0.112 ms > (3 rows) > > Joined: > > test=> explain analyze > test-> SELECT cli_name,order.* > test-> FROM order > test-> JOIN client ON (ord_client = cli_code) > test-> WHERE ord_batch='343B' AND ord_id='12-645'; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------------- > Nested Loop (cost=0.00..12.00 rows=2 width=525) (actual time=0.120..0.128 rows=1 loops=1) > -> Index Scan using order_pkey on order (cost=0.00..6.02 rows=1 width=486) (actual time=0.064..0.066 rows=1 loops=1) > Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar)) > -> Index Scan using client_pkey on client (cost=0.00..5.98 rows=1 width=51) (actual time=0.023..0.026 rows=1 loops=1) > Index Cond: ("outer".ord_client = client.cli_code) > Total runtime: 0.328 ms > (6 rows) > >
Attachment
On Thu, 2005-07-28 at 21:01 -0500, John A Meinel wrote: > > Also, did you run it *lots* of times to make sure that this isn't just > noise? If a dozen is lots, yes. :-) It was very consistent as I repeatedly ran it. -- Karim Nassar <karim.nassar@acm.org>
On 7/29/05, Karim Nassar <karim.nassar@acm.org> wrote:
where is the cli_code condition in the above query?
I ran into a situation today maintaining someone else's code where the
sum time running 2 queries seems to be faster than 1. The original code
was split into two queries. I thought about joining them, but
considering the intelligence of my predecessor, I wanted to test it.
The question is, which technique is really faster? Is there some hidden
setup cost I don't see with explain analyze?
Postgres 7.4.7, Redhat AES 3
Each query individually:
test=> explain analyze
test-> select * from order WHERE ord_batch='343B' AND ord_id='12-645';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using order_pkey on order (cost=0.00..6.02 rows=1 width=486) (actual time= 0.063..0.066 rows=1 loops=1)
Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
Total runtime: 0.172 ms
(3 rows)
test=> explain analyze
test-> select cli_name from client where cli_code='1837';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Scan using client_pkey on client (cost=0.00..5.98 rows=2 width=39) (actual time= 0.043..0.047 rows=1 loops=1)
Index Cond: (cli_code = '1837'::bpchar)
Total runtime: 0.112 ms
(3 rows)
Joined:
test=> explain analyze
test-> SELECT cli_name,order.*
test-> FROM order
test-> JOIN client ON (ord_client = cli_code)
test-> WHERE ord_batch='343B' AND ord_id='12-645';
where is the cli_code condition in the above query?
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..12.00 rows=2 width=525) (actual time=0.120..0.128 rows=1 loops=1)
-> Index Scan using order_pkey on order (cost=0.00..6.02 rows=1 width=486) (actual time=0.064..0.066 rows=1 loops=1)
Index Cond: ((ord_batch = '343B'::bpchar) AND (ord_id = '12-645'::bpchar))
-> Index Scan using client_pkey on client (cost=0.00..5.98 rows=1 width=51) (actual time=0.023..0.026 rows=1 loops=1)
Index Cond: ("outer".ord_client = client.cli_code)
Total runtime: 0.328 ms
(6 rows)
--
Karim Nassar <karim.nassar@acm.org >
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
--
with regards,
S.Gnanavel
Satyam Computer Services Ltd.
On Fri, 2005-07-29 at 09:41 +0530, Gnanavel S wrote: > > Joined: > > test=> explain analyze > test-> SELECT cli_name,order.* > test-> FROM order > test-> JOIN client ON (ord_client = cli_code) > test-> WHERE ord_batch='343B' AND > ord_id='12-645'; > > where is the cli_code condition in the above query? I don't understand the question. ord_client is the client code, and cli_code is the client code, for their respective tables. batch/id is unique, so there is only one record from order, and only one client to associate. Clearer? -- Karim Nassar <karim.nassar@acm.org>
On 7/29/05, Karim Nassar <karim.nassar@acm.org> wrote:
ok.
Reason might be comparing with a literal value (previous case) is cheaper than comparing with column(as it has to be evaluated). But with the previous case getting and assigning the cli_code in the application and executing in db will be time consuming as it includes IPC cost.
On Fri, 2005-07-29 at 09:41 +0530, Gnanavel S wrote:
>
> Joined:
>
> test=> explain analyze
> test-> SELECT cli_name,order.*
> test-> FROM order
> test-> JOIN client ON (ord_client = cli_code)
> test-> WHERE ord_batch='343B' AND
> ord_id='12-645';
>
> where is the cli_code condition in the above query?
I don't understand the question. ord_client is the client code, and
cli_code is the client code, for their respective tables. batch/id is
unique, so there is only one record from order, and only one client to
associate.
Clearer?
ok.
Reason might be comparing with a literal value (previous case) is cheaper than comparing with column(as it has to be evaluated). But with the previous case getting and assigning the cli_code in the application and executing in db will be time consuming as it includes IPC cost.
--
Karim Nassar <karim.nassar@acm.org >
--
with regards,
S.Gnanavel
Satyam Computer Services Ltd.