Re: Help with optimizing a query over hierarchical data - Mailing list pgsql-performance

From Claudio Freire
Subject Re: Help with optimizing a query over hierarchical data
Date
Msg-id CAGTBQpYM=BmLJ5hr8UXE9O0Ks0v3aO8hQMX4DhG-5e3H2_2aMw@mail.gmail.com
Whole thread Raw
In response to Re: Help with optimizing a query over hierarchical data  (Damon Snyder <damon@huddler-inc.com>)
Responses Re: Help with optimizing a query over hierarchical data
List pgsql-performance
Um... I think your problem is a misuse of CTE. Your CTE is building an
intermediate of several thousands of rows only to select a dozen
afterwards. You may want to consider a view or subquery, though I'm
not sure pg will be able to optimize much given your use of window
functions, which forces a materialization of that intermediate result.

I think you need to re-think your queries to be smarter about that.

On Mon, Mar 3, 2014 at 2:55 PM, Damon Snyder <damon@huddler-inc.com> wrote:
> Hi Claudio,
> Thanks for responding. Here is the explain (http://explain.depesz.com/s/W3W)
> for the ordering by meta container starting on line 192
> (https://gist.github.com/drsnyder/9277054#file-object-ordering-setup-sql-L192).
>
> Here is the explain (http://explain.depesz.com/s/d1O) for the ordering by
> score starting on line 192
> (https://gist.github.com/drsnyder/9277054#file-object-ordering-setup-sql-L216).
>
> Both of the explains were done with (ANALYZE, BUFFERS).
>
> Thanks for the suggestion regarding de-normalizing. I'll consider that
> approach for the score based query.
>
> I've also included the server config changes made from updates to
> postgresql.conf on the box that I'm testing on. See below.
>
> Thanks,
> Damon
>
>                                                   version
> --------------------------------------------------------------------------------------------------------------
>  PostgreSQL 9.2.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7
> 20120313 (Red Hat 4.4.7-3), 64-bit
> (1 row)
>
>              name             |  current_setting   |        source
> ------------------------------+--------------------+----------------------
>  application_name             | psql               | client
>  checkpoint_completion_target | 0.9                | configuration file
>  checkpoint_segments          | 16                 | configuration file
>  DateStyle                    | ISO, MDY           | configuration file
>  default_tablespace           | ssd2               | user
>  default_text_search_config   | pg_catalog.english | configuration file
>  effective_cache_size         | 5632MB             | configuration file
>  lc_messages                  | en_US.UTF-8        | configuration file
>  lc_monetary                  | en_US.UTF-8        | configuration file
>  lc_numeric                   | en_US.UTF-8        | configuration file
>  lc_time                      | en_US.UTF-8        | configuration file
>  listen_addresses             | *                  | configuration file
>  log_destination              | stderr             | configuration file
>  log_directory                | pg_log             | configuration file
>  log_filename                 | postgresql-%a.log  | configuration file
>  log_line_prefix              | %d %m %c %x:       | configuration file
>  log_min_duration_statement   | 500ms              | configuration file
>  log_min_error_statement      | error              | configuration file
>  log_min_messages             | error              | configuration file
>  log_rotation_age             | 1d                 | configuration file
>  log_rotation_size            | 0                  | configuration file
>  log_timezone                 | UTC                | configuration file
>  log_truncate_on_rotation     | on                 | configuration file
>  logging_collector            | on                 | configuration file
>  maintenance_work_mem         | 480MB              | configuration file
>  max_connections              | 80                 | configuration file
>  max_stack_depth              | 2MB                | environment variable
>  port                         | 5432               | command line
>  shared_buffers               | 1920MB             | configuration file
>  TimeZone                     | UTC                | configuration file
>  wal_buffers                  | 16MB               | configuration file
>  work_mem                     | 8MB                | configuration file
> (32 rows)
>
>
>
> On Sat, Mar 1, 2014 at 5:02 PM, Claudio Freire <klaussfreire@gmail.com>
> wrote:
>>
>> On Fri, Feb 28, 2014 at 5:01 PM, Damon Snyder <damon@huddler-inc.com>
>> wrote:
>> > The primary query that I'm trying to optimize executes in about 1600ms
>> > on my
>> > laptop and about 800ms on production-like hardware (more for the score
>> > version). My target is to get the data fetch down below 100ms if
>> > possible.
>>
>> Could you post some explain analyze of those particular queries?
>>
>> > If you have any suggestions it would be greatly appreciated. Am I
>> > missing
>> > something obvious? Is there a logically equivalent alternative that
>> > would be
>> > more efficient?
>>
>> I'd suggest de-normalizing a bit. For instance, why don't you put the
>> score right into the object? I'm sure the indirection is hurting.
>
>


pgsql-performance by date:

Previous
From: Eli Naeher
Date:
Subject: Subselect an order of magnitude faster than nested joins
Next
From: Venkata Balaji Nagothi
Date:
Subject: Re: Query taking long time