Re: Creating tons of tables to support a query - Mailing list pgsql-general

From Jan Ploski
Subject Re: Creating tons of tables to support a query
Date
Msg-id 14629117.1031590075510.JavaMail.jpl@remotejava
Whole thread Raw
In response to Creating tons of tables to support a query  (Jan Ploski <jpljpl@gmx.de>)
Responses Re: Creating tons of tables to support a query  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Mon, Sep 09, 2002 at 11:41:38AM -0400, Tom Lane wrote:
> Yes, it makes sense that for a little-used section that way wouldn't be
> very efficient.  I would suggest that you want to use an index on
> (sectionID, dateSent), and that the way to make the system do the
> right thing is
>
> select msgnum from message where
> sectionID = ? and
> dateSent > ?
> order by sectionID, dateSent
> limit 1;
>
> Without the extra ORDER BY clause, the planner is not smart enough to
> see that the requested ordering actually matches the index ordering.

Tom,

thanks you for advice. Now I get performance comparable to that
when using a partial index with "workarounds", i.e.

select msgnum from message where
    sectionID=9 and
    dateSent>'2000-11-12 02:05:35.94'
    order by sectionID,dateSent limit 1;

works equally fast with an index on message(sectionID, dateSent)
as through a partial index on message where sectionID = 9.

When executed on the empty section 241, the statement is way
faster than the partial index based solution (0.67 seconds vs 8.5 seconds),
probably because I had to resort to trickery to make it work
reasonably at all.

What I still cannot grasp is why

select msgnum into v_cnt from message where sectionID = 241
    order by dateSent desc limit 1;

is so much faster than

v_sid := 241;
select msgnum into v_cnt from message where sectionID = v_sid
    order by dateSent desc limit 1;

as I mentioned in an earlier message. In fact, to get the partial index
solution up to decent performance, I had to write something like

delete from tmp;
execute
    ''insert into tmp select msgnum into v_cnt from message where sectionID = '' ||
    v_sid || '' oder by dateSent desc limit 1'';
select max(cnt) into v_cnt from tmp;

Following Richard's suggestion, I turned query debugging on in
postgresql.conf. What jumps into my eyes in the log is

Shared blocks:        132 read      (sectionID = 241)
vs
Shared blocks:        1517 read     (sectionID = v_sid)

in postgres usage stats output after executing the query.

When I wrap the select statement into the awkward execute,
I get "152 read", which is still much, much better than "1517 read",
if that (apart from execution time) can be taken as an indicator
for performance. Do you have any clues?

> Another possible gotcha is that depending on datatype details the
> planner might be using only one of the two index columns.  As far
> as I noticed, you didn't tell us the exact column datatypes or the
> exact form in which the comparison values are supplied?

The column types are integer for sectionID is and timestamp for dateSent.
I am passing parameters of these types into a PL/pgSQL procedure, which then
executes a "select into" with these parameters in the where clause.

Take care -
JPL


pgsql-general by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: match one word
Next
From: Tom Lane
Date:
Subject: Re: Debugging plpgsql functions