Re: Performance problems with DISTINCT ON - Mailing list pgsql-performance

From imad
Subject Re: Performance problems with DISTINCT ON
Date
Msg-id 1f30b80c0910031927k281d1d96oaeed1b1b8a40c606@mail.gmail.com
Whole thread Raw
In response to Performance problems with DISTINCT ON  ("Sgarbossa Domenico" <domenico.sgarbossa@eniac.it>)
List pgsql-performance
The index can produce the sorted output. Add a dummy WHERE clause like
articoli > <min_value> and data_ent > <min_value>.


--Imad

On Mon, Sep 28, 2009 at 10:18 PM, Sgarbossa Domenico
<domenico.sgarbossa@eniac.it> wrote:
>
> I need to retrieve the most recent prices per products from a price list
> table:
>
> CREATE TABLE listini_anagrafici
> (
>   id character varying(36) NOT NULL,
>   articolo character varying(18),
>   listino character varying(5),
>   data_ent date,
>   data_fin date,
>   prezzo double precision,
>   ultimo boolean DEFAULT false,
>   date_entered timestamp without time zone NOT NULL,
>   date_modified timestamp without time zone NOT NULL,
>   created_by character varying(36),
>   modified_user_id character varying(36) NOT NULL,
>   deleted boolean NOT NULL DEFAULT false,
>   CONSTRAINT listini_anagrafici_id_key UNIQUE (id)
> )
>
> I guess the right query is:
>
> select distinct on (articolo) articolo,data_ent,prezzo from
> listini_anagrafici order by articolo, data_ent desc
>
> but it seems that this query runs slowly... about 5/6 seconds.
> the table contains more or less 500K records, PostgreSQL version is 8.1.11
> and the server has 4gb of RAM entirely dedicate to the db.
>
> I've tried adding this index
>
> CREATE INDEX articolo_data_ent ON listini_anagrafici (articoli, data_ent)
>
> but it doesn't helps.
>
> As you can see from the explain command (below) the query seems to ignore
> the index
>
> 'Unique  (cost=73897.58..76554.94 rows=77765 width=24)'
> '  ->  Sort  (cost=73897.58..75226.26 rows=531472 width=24)'
> '        Sort Key: articolo, data_ent'
> '        ->  Seq Scan on listini_anagrafici  (cost=0.00..16603.72
> rows=531472 width=24)'
>
> anyone knows how to make this query run faster?
>
>
>
>

pgsql-performance by date:

Previous
From: Justin Pryzby
Date:
Subject: dump time increase by 1h with new kernel
Next
From: Karl Denninger
Date:
Subject: Re: Best suiting OS