[rearranging to put related information together]
Since the LIMIT 3 and LIMIT 4 queries generated exactly the same
plan, the increased time for LIMIT 4 suggests that there are 3
matching rows which are near the end of the index it is scanning, but
the fourth one is much farther in.
Since what you're showing suggests that the active portion of your
data is heavily cached, you might benefit from decreasing
random_page_cost, and possibly also seq_page_cost.
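As a sketch of what that tuning might look like (the values here are illustrative starting points, not measured recommendations, and assume the working set really is mostly in cache):

```
# postgresql.conf -- illustrative values for a heavily cached working set
seq_page_cost = 0.5       # default 1.0
random_page_cost = 1.0    # default 4.0; bring it close to seq_page_cost
                          # when random reads rarely touch the disk
```

Lowering random_page_cost relative to seq_page_cost tells the planner that index scans are cheaper than the defaults assume, which is true when pages are served from cache.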
> 8GB RAM
> effective_cache_size = 1536MB
> Please also note that this hardware isn't dedicated DB server, but
> also serve as web server and file server.
Even with those other uses, you're likely to actually be using 6 GB
or 7 GB for cache. I'd set effective_cache_size in that range.
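In postgresql.conf terms (effective_cache_size is only a planner hint, not an allocation, so raising it costs no memory; 6GB is the low end of the range suggested above):

```
# postgresql.conf -- planner estimate of OS + shared_buffers cache
effective_cache_size = 6GB
```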
> max_connections = 250
> work_mem = 128MB
While probably not related to this problem, that's a dangerous
combination. What if all 250 connections become active at once with
queries that each allocate work_mem? And a single connection can
actually hold several work_mem-sized allocations at once, one per
sort or hash step in its plan.
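The arithmetic makes the danger concrete. The settings below come from the post; the "allocations per query" multiplier is a hypothetical illustration of a plan with two sort/hash steps:

```python
# Rough worst-case memory demand for the quoted settings.
max_connections = 250     # from the post
work_mem_mb = 128         # from the post
allocs_per_query = 2      # assumption: e.g. two sorts/hashes in one plan

worst_case_gb = max_connections * work_mem_mb * allocs_per_query / 1024
print(worst_case_gb)  # 62.5 -- far beyond the 8 GB of RAM on this box
```

Even with a single allocation per query, 250 x 128MB is 31.25 GB, so a fraction of the connections running sorts concurrently could push the machine into swap.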
> 2 SATA 750GB (pg db installed in software RAID 0)
You do realize that if either drive dies you lose all your data on
that pair of drives, right? I hope the value of the data and well
tested backup procedures keep the loss to something which is
acceptable.
> I have about 3 million rows in core_object, 1.5 million in
> plugin_plugin_addr and 1.5 million in plugins_guide_address.
> When there were 300 000+ objects queries works perfectly, but as db
> enlarge things go worse...
With a relational database, it's not unusual for the most efficient
plan to depend on the quantity of data in the tables. It is
important that your statistics are kept up-to-date so that plans can
adapt to the changing table sizes or data distributions. The
effective_cache_size and cost parameters are also used to calculate
the costs of various plans, so adjusting those may help the optimizer
make good choices.
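Keeping statistics current is standard PostgreSQL; on a database growing this fast it can be worth running ANALYZE on the tables named in the post (autovacuum normally handles this, but only if it is enabled and keeping up):

```sql
-- Refresh planner statistics for the tables mentioned above
ANALYZE core_object;
ANALYZE plugin_plugin_addr;
ANALYZE plugins_guide_address;
```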