Re: Creating tons of tables to support a query - Mailing list pgsql-general

From Jan Ploski
Subject Re: Creating tons of tables to support a query
Date
Msg-id 14629117.1031568752666.JavaMail.jpl@remotejava
In response to Creating tons of tables to support a query  (Jan Ploski <jpljpl@gmx.de>)
Responses Re: Creating tons of tables to support a query  (Richard Huxton <dev@archonet.com>)
Re: Creating tons of tables to support a query  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Sun, Sep 08, 2002 at 07:49:32PM -0700, Stephan Szabo wrote:
> On Sun, 8 Sep 2002, Jan Ploski wrote:
>
> > I am in particular wondering, why an index on message(sectionID, dateSent)
> > does not make these queries comparably fast:
> >
> >     select msgnum from message where
> >         sectionID = ? and
> >         dateSent > ?
> >         order by dateSent
> >         limit 1;
>
> I don't think that'll use an index on (sectionID, dateSent) for the sort
> step.  I think an index on (dateSent,sectionID) might be, however.

Stephan,

Indeed, my mistake. With an index on (dateSent,sectionID), the plan becomes:

Limit  (cost=0.00..2.36 rows=1 width=10)
  ->  Index Scan using test_idx2 on message  (cost=0.00..10482.08 rows=4449 width=10)
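For reference, the index and query behind this plan are (putting together what was discussed above):

```sql
CREATE INDEX test_idx2 ON message (dateSent, sectionID);

SELECT msgnum FROM message
WHERE sectionID = ? AND dateSent > ?
ORDER BY dateSent
LIMIT 1;
```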

Alas, this does not help me further. I did two tests:

Test 1: Section 9 contained 5143 messages.
Test 2: Section 241 contained 0 messages.

The timing results (for 5000 queries) are:

1. Using index on message(dateSent, sectionID):      11 seconds
   Using index on scnt_9(dateSent):                  17 seconds

2. Using index on message(dateSent, sectionID):     320 seconds
   Using index on scnt_241(dateSent):                 2 seconds


The problem is that (apparently?) the index scan on (dateSent, sectionID) has
to step over the entries of all other sections until it finds a match for the
requested one - and for a section with no messages at all, that means scanning
the whole index before the LIMIT can be satisfied. The scnt_241 index, by
contrast, simply contains no entries and yields quick results.
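To make the auxiliary-table setup explicit (a sketch only - the column types
and the index name are my guesses here, and the trigger code that keeps these
tables in sync with message is omitted):

```sql
-- one auxiliary table per section, e.g. for section 241:
CREATE TABLE scnt_241 (
    msgnum   integer,
    dateSent timestamp
);
CREATE INDEX scnt_241_dateSent_idx ON scnt_241 (dateSent);

-- the per-section query then needs no sectionID filter at all:
SELECT msgnum FROM scnt_241
WHERE dateSent > ?
ORDER BY dateSent
LIMIT 1;
```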

Since the auxiliary tables speed things up so much and behave well
for sections with few messages, I tend to believe that this is the
way to go for me. Two questions remain open: first, what kind of
overhead do I incur by creating that many tables (hundreds, maybe
thousands in the future)? And second, since PL/pgSQL in 7.1.3 does
not support "EXECUTE 'SELECT ...' INTO", I need to collect results
by inserting into a temporary table. Has this kind of "execute"
statement been implemented in the newest version of PostgreSQL yet?
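To illustrate the workaround I mean (a sketch only - the temp table name and
variable names are made up, and the doubled single quotes are as required
inside a pre-7.x-dollar-quoting PL/pgSQL function body):

```sql
-- inside a PL/pgSQL function, since EXECUTE ... INTO is not available:
CREATE TEMP TABLE exec_result (msgnum integer);

-- the dynamic statement writes its result into the temp table...
EXECUTE ''INSERT INTO exec_result ''
     || ''SELECT msgnum FROM scnt_'' || sectionID
     || '' WHERE dateSent > '' || quote_literal(cutoff)
     || '' ORDER BY dateSent LIMIT 1'';

-- ...and the value is fetched back with a plain SELECT INTO:
SELECT msgnum INTO result FROM exec_result;
```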

Take care -
JPL
