Efficiency Question - Mailing list pgsql-sql

From Colin Fox
Subject Efficiency Question
Date
Msg-id pan.2002.09.09.00.12.58.87388@cfconsulting.ca
Whole thread Raw
Responses Re: Efficiency Question  (Bruno Wolff III <bruno@wolff.to>)
List pgsql-sql
Hi, all.

I'm putting together a small query that should return the most recent
entry from a table by date (I can't use an ID field as new entries may be
added with older dates). It's not crucial that this run at 100%
efficiency, but I'm interested in the results and/or discussion, as this
will probably relate to other larger queries later.

Two methods occur to me, and I don't have a large enough data set to get
any kind of idea of the actual response, and I'm not sure I understand the
explain plan.

Method 1 is like this:

select   *
from   motm
where   creation_date = (select max(creation_date) from motm);

Which requires a subselect and a max operator on a date field. Method two
is:

select  *
from  motm
order by  creation_date desc
limit 1;

So in the first case I select the record that has the largest date. In the
second case, I order all the records, and then return only one.

Here's the explain for both:

----------
Method 1:
NOTICE:  QUERY PLAN:

Merge Join  (cost=23.77..23.96 rows=1 width=60) InitPlan   ->  Aggregate  (cost=22.50..22.50 rows=1 width=8)         ->
Seq Scan on motm  (cost=0.00..20.00 rows=1000 width=8) ->  Sort  (cost=22.67..22.67 rows=10 width=20)       ->  Seq
Scanon motm m  (cost=0.00..22.50 rows=10 width=20) ->  Sort  (cost=1.11..1.11 rows=5 width=40)       ->  Seq Scan on
peoplep  (cost=0.00..1.05 rows=5 width=40)
 

EXPLAIN
----------
Method 2:
NOTICE:  QUERY PLAN:

Limit  (cost=84.91..84.91 rows=1 width=68) ->  Sort  (cost=84.91..84.91 rows=50 width=68)       ->  Merge Join
(cost=70.94..83.50rows=50 width=68)             ->  Sort  (cost=69.83..69.83 rows=1000 width=28)                   ->
SeqScan on motm m  (cost=0.00..20.00 rows=1000 width=28)             ->  Sort  (cost=1.11..1.11 rows=5 width=40)
          ->  Seq Scan on people p  (cost=0.00..1.05 rows=5 width=40)
 

EXPLAIN
--------------
According to the cost score, it seems that method 1 is faster, almost 4x!
Is that actually the case?

Opinions welcome. :)

Colin


pgsql-sql by date:

Previous
From: jlparkinson@bigpond.com
Date:
Subject: Slow Multi-joins performance
Next
From: "Jay"
Date:
Subject: Database joins