Thread: Performance problems with DISTINCT ON

Performance problems with DISTINCT ON

From

"Sgarbossa Domenico"

Date:

03 October 2009, 23:04:39

I need to retrieve the most recent prices per products from a price list table:

CREATE TABLE listini_anagrafici
(
id character varying(36) NOT NULL,
articolo character varying(18),
listino character varying(5),
data_ent date,
data_fin date,
prezzo double precision,
ultimo boolean DEFAULT false,
date_entered timestamp without time zone NOT NULL,
date_modified timestamp without time zone NOT NULL,
created_by character varying(36),
modified_user_id character varying(36) NOT NULL,
deleted boolean NOT NULL DEFAULT false,
CONSTRAINT listini_anagrafici_id_key UNIQUE (id)
)

I guess the right query is:

select distinct on (articolo) articolo,data_ent,prezzo from listini_anagrafici order by articolo, data_ent desc

but it seems that this query runs slowly... about 5/6 seconds.

the table contains more or less 500K records, PostgreSQL version is 8.1.11 and the server has 4gb of RAM entirely dedicate to the db.

I've tried adding this index

CREATE INDEX articolo_data_ent ON listini_anagrafici (articoli, data_ent)

but it doesn't helps.

As you can see from the explain command (below) the query seems to ignore the index

'Unique (cost=73897.58..76554.94 rows=77765 width=24)'
' -> Sort (cost=73897.58..75226.26 rows=531472 width=24)'
' Sort Key: articolo, data_ent'
' -> Seq Scan on listini_anagrafici (cost=0.00..16603.72 rows=531472 width=24)'

anyone knows how to make this query run faster?

Re: Performance problems with DISTINCT ON

From

imad

Date:

03 October 2009, 23:37:20

The index can produce the sorted output. Add a dummy WHERE clause like
articoli > <min_value> and data_ent > <min_value>.


--Imad

On Mon, Sep 28, 2009 at 10:18 PM, Sgarbossa Domenico
<domenico.sgarbossa@eniac.it> wrote:
>
> I need to retrieve the most recent prices per products from a price list
> table:
>
> CREATE TABLE listini_anagrafici
> (
>   id character varying(36) NOT NULL,
>   articolo character varying(18),
>   listino character varying(5),
>   data_ent date,
>   data_fin date,
>   prezzo double precision,
>   ultimo boolean DEFAULT false,
>   date_entered timestamp without time zone NOT NULL,
>   date_modified timestamp without time zone NOT NULL,
>   created_by character varying(36),
>   modified_user_id character varying(36) NOT NULL,
>   deleted boolean NOT NULL DEFAULT false,
>   CONSTRAINT listini_anagrafici_id_key UNIQUE (id)
> )
>
> I guess the right query is:
>
> select distinct on (articolo) articolo,data_ent,prezzo from
> listini_anagrafici order by articolo, data_ent desc
>
> but it seems that this query runs slowly... about 5/6 seconds.
> the table contains more or less 500K records, PostgreSQL version is 8.1.11
> and the server has 4gb of RAM entirely dedicate to the db.
>
> I've tried adding this index
>
> CREATE INDEX articolo_data_ent ON listini_anagrafici (articoli, data_ent)
>
> but it doesn't helps.
>
> As you can see from the explain command (below) the query seems to ignore
> the index
>
> 'Unique  (cost=73897.58..76554.94 rows=77765 width=24)'
> '  ->  Sort  (cost=73897.58..75226.26 rows=531472 width=24)'
> '        Sort Key: articolo, data_ent'
> '        ->  Seq Scan on listini_anagrafici  (cost=0.00..16603.72
> rows=531472 width=24)'
>
> anyone knows how to make this query run faster?
>
>
>
>

Re: Performance problems with DISTINCT ON

From

Tom Lane

Date:

03 October 2009, 23:53:15

"Sgarbossa Domenico" <domenico.sgarbossa@eniac.it> writes:
> I guess the right query is:

> select distinct on (articolo) articolo,data_ent,prezzo from listini_anagrafici order by articolo, data_ent desc

> but it seems that this query runs slowly... about 5/6 seconds.

> I've tried adding this index
> CREATE INDEX articolo_data_ent ON listini_anagrafici (articoli, data_ent)
> but it doesn't helps.

That index doesn't match the query ordering.  You could do

select distinct on (articolo) articolo,data_ent,prezzo from listini_anagrafici order by articolo desc, data_ent desc

In more recent versions of Postgres you could make an index with one
column ascending and the other descending, but AFAIR 8.1 doesn't have
that.

            regards, tom lane