Re: Help with optimizing a query over hierarchical data - Mailing list pgsql-performance

From Damon Snyder
Subject Re: Help with optimizing a query over hierarchical data
Date
Msg-id CACkQbuhpsjSLnXykKVcieZey7mW5TozW-_j=TTE0Hpr2bxa8YA@mail.gmail.com
Whole thread Raw
In response to Re: Help with optimizing a query over hierarchical data  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-performance
Hi Claudio,
Thanks for the help!

Damon


On Mon, Mar 3, 2014 at 8:20 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
On Mon, Mar 3, 2014 at 10:12 PM, Damon Snyder <damon@huddler-inc.com> wrote:
>
>> Um... I think your problem is a misuse of CTE. Your CTE is building an
> intermediate of several thousands of rows only to select a dozen
> afterwards. You may want to consider a view or subquery, though I'm
> not sure pg will be able to optimize much given your use of window
> functions, which forces a materialization of that intermediate result.
>
> The application requires that we find an element and it's neighbors within a
> sorted set at a given offset after filtering by category and status. In the
> examples provided, we need position 50000, 6 above, and 6 below. Is there a
> way do to that more efficiently without first determining the position of
> each element within the set using a window function? How would a subquery
> help?
>
> The only solution I could come up with was to materialize the intermediate
> result with the CTE (since you don't know ahead of time how many objects
> match the status and category criteria) then use the window to include the
> position or index.


You're materializing on a per-query basis. That's no good (as your
timings show). Try to find a way to materialize on a more permanent
basis.

I cannot give you a specific solution without investing way more time
than I would. But consider this: all your queries costs are CPU costs.
You need a better algorithm, or better hardware. I doubt you'll find
hardware that performs 16 times faster, so you have to concentrate on
a better algorithm.

And it's unlikely you'll find a better algorithm without a better data
structure. So you need to reorganize your database to make it easier
to query. I don't think simple SQL optimizations will get you to your
performance goal.

pgsql-performance by date:

Previous
From: "acanada"
Date:
Subject: Re: Query taking long time
Next
From: Venkata Balaji Nagothi
Date:
Subject: Re: Query taking long time