Re: Select Last n Rows Matching an Index Condition (and caches) - Mailing list pgsql-general

From Alex Stapleton
Subject Re: Select Last n Rows Matching an Index Condition (and caches)
Date
Msg-id CMEKJGNLDMNLHLKAEDDEMEMDHHAA.alexs@advfn.com
Whole thread Raw
In response to Re: Select Last n Rows Matching an Index Condition (and caches)  ("Hegyvari Krisztian" <hegyvari.krisztian@ardents.hu>)
Responses Re: Select Last n Rows Matching an Index Condition (and caches)  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
If I SET ENABLE_SEQSCAN TO OFF it uses the Index, but it does an Index scan
backward if I do

SELECT * FROM article WHERE symbol=12646 AND source = 19 ORDER BY
symbol desc, source DESC, time DESC LIMIT 1000;

Which is better but still quite slow.

-----Original Message-----
From: Hegyvari Krisztian [mailto:hegyvari.krisztian@ardents.hu]
Sent: 18 March 2005 10:25
To: Alex Stapleton; pgsql-general@postgresql.org
Subject: RE: [GENERAL] Select Last n Rows Matching an Index Condition
(and caches)


Does not it look like the index you are actually using is on article and
then PG has to filter for symbol and source?

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alex Stapleton
Sent: Friday, March 18, 2005 10:57 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Select Last n Rows Matching an Index Condition
(and caches)

Woops sorry we have indexes on (symbol, source, time) and there is no
date
column :/

SELECT * FROM article WHERE symbol=12646 AND source = 19 ORDER BY time
DESC
LIMIT 1000;
                                                     QUERY PLAN
------------------------------------------------------------------------
----
----------------------------------------
 Limit  (cost=321163.29..321165.79 rows=1000 width=466) (actual
time=33243.924..33246.021 rows=1000 loops=1)
   ->  Sort  (cost=321163.29..321402.31 rows=95609 width=466) (actual
time=33243.917..33244.626 rows=1000 loops=1)
         Sort Key: "time"
         ->  Seq Scan on article  (cost=0.00..301724.00 rows=95609
width=466) (actual time=0.022..32979.685 rows=42959 loops=1)
               Filter: ((symbol = 12646) AND (source = 19))
 Total runtime: 33258.706 ms
(6 rows)


explain analyze SELECT * FROM article WHERE symbol=12646 AND source =
19;
                                                     QUERY PLAN
------------------------------------------------------------------------
----
----------------------------------------
 Seq Scan on article  (cost=0.00..301724.00 rows=95609 width=466)
(actual
time=0.021..33275.433 rows=42959 loops=1)
   Filter: ((symbol = 12646) AND (source = 19))
 Total runtime: 33320.920 ms
(3 rows)

We can't use CLUSTER because we need the DB up all the time.

The analyze suggests that it's the I/O taking most of the time to me.


-----Original Message-----
From: Alban Hertroys [mailto:alban@magproductions.nl]
Sent: 18 March 2005 09:48
To: Alex Stapleton
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Select Last n Rows Matching an Index Condition
(and caches)


Alex Stapleton wrote:
> SELECT * FROM article WHERE symbol=12646 AND source = 19 ORDER BY time
DESC
> LIMIT 1000;
>
> To get the latest 1000 rows for that symbol and source.

> My (not yet implemented) solution to this problem is to add a SEQUENCE
and
> index it so that by adding a WHERE id > [max_id]-1000 and ordering by
time
> DESC will reduce the I/O quite a lot. Am I right here? It would be
nice if
> there was a way to get PostgreSQL to try and precache the tables pages
as
> well, is there anyway I could achieve something like that? I have
toyed
with
> creating a ramdisk to store a lookup table of sorts on (we only care
about
a
> few columns initially) to speed this up a bit but its a right pain in
the
> arse to do by the looks of things.

First question that always gets asked here: What's the output of explain
analyse? Without that, people here can't see where the slowdown is.

I expect though, that the problem is the ordering by time. I imagine
that you could create an index on time, maybe truncated to months or
something similar (You can create indices based on functions). That
index alone should speed up the ordering already.
It could also be used to cluster the table, which should speed up things
some more, I suppose.

--
Alban Hertroys
MAG Productions

T: +31(0)53 4346874
F: +31(0)53 4346876
E: alban@magproductions.nl
W: http://www.magproductions.nl


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to majordomo@postgresql.org so that your
      message can get through to the mailing list cleanly




pgsql-general by date:

Previous
From: "Hegyvari Krisztian"
Date:
Subject: Re: Select Last n Rows Matching an Index Condition (and caches)
Next
From: "Alex Stapleton"
Date:
Subject: Re: Select Last n Rows Matching an Index Condition (and caches)