Home > mailing lists

Re: Query tuning: partitioning, DISTINCT ON, and indexing - Mailing list pgsql-performance

From	Maciek Sakrejda
Subject	Re: Query tuning: partitioning, DISTINCT ON, and indexing
Date	June 21, 2013 05:14:46
Msg-id	CAOtHd0DnNT=eHyR0VS6eHP7zthTgEUxN6j35nrh5z5THze7PkA@mail.gmail.com Whole thread
In response to	Re: Query tuning: partitioning, DISTINCT ON, and indexing (bricklen <bricklen@gmail.com>)
Responses	Re: Query tuning: partitioning, DISTINCT ON, and indexing
List	pgsql-performance

Tree view

On Thu, Jun 20, 2013 at 9:13 PM, bricklen <bricklen@gmail.com> wrote:

On Thu, Jun 20, 2013 at 6:24 PM, Maciek Sakrejda <m.sakrejda@gmail.com> wrote:
SELECT
DISTINCT ON (type) ts, type, details
FROM
observations
WHERE
subject = '...'
ORDER BY
type, ts DESC;

First thing: What is your "work_mem" set to, and how much RAM is in the machine? If you look at the plan, you'll immediately notice the "external merge Disk" line where it spills to disk on the sort. Try setting your work_mem to 120MB or so (depending on how much RAM you have, # concurrent sessions, complexity of queries etc)

Good call, thanks, although the in-mem quicksort is not much faster:

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Unique (cost=471248.30..489392.67 rows=3 width=47) (actual time=32002.133..32817.474 rows=3 loops=1)

Buffers: shared read=30264

-> Sort (cost=471248.30..480320.48 rows=3628873 width=47) (actual time=32002.128..32455.950 rows=3628803 loops=1)

Sort Key: public.observations.type, public.observations.ts

Sort Method: quicksort Memory: 381805kB

Buffers: shared read=30264

-> Result (cost=0.00..75862.81 rows=3628873 width=47) (actual time=0.026..1323.317 rows=3628803 loops=1)

Buffers: shared read=30264

-> Append (cost=0.00..75862.81 rows=3628873 width=47) (actual time=0.026..978.477 rows=3628803 loops=1)

Buffers: shared read=30264

...

the machine is not nailed down, but I think I'd need to find a way to drastically improve the plan to keep this in Postgres. The alternative is probably caching the results somewhere else: for any given subject, I only need the latest observation of each type 99.9+% of the time.

pgsql-performance by date:

From: bricklen
Date: 21 June 2013, 04:13:16
Subject: Re: Query tuning: partitioning, DISTINCT ON, and indexing

From: bricklen
Date: 21 June 2013, 16:08:45
Subject: Re: Query tuning: partitioning, DISTINCT ON, and indexing

Re: Query tuning: partitioning, DISTINCT ON, and indexing - Mailing list pgsql-performance

Previous

Next