Re: Optimizing count(), but Explain estimates wildly off - Mailing list pgsql-performance

From Laurenz Albe
Subject Re: Optimizing count(), but Explain estimates wildly off
Date
Msg-id 4bba24a3f55291fd8a17460ed0b3239e3a022c24.camel@cybertec.at
Whole thread Raw
In response to Optimizing count(), but Explain estimates wildly off  (Chema <chema@interneta.org>)
Responses Re: Optimizing count(), but Explain estimates wildly off  (Chema <chema@interneta.org>)
List pgsql-performance
On Mon, 2024-02-26 at 18:25 -0600, Chema wrote:
> I'm trying to optimize simple queries on two tables (tenders & items) with a couple
> million records.  Besides the resulting records, the app also displays the count of
> total results.  Doing count() takes as much time as the other query (which can be
> 30+ secs), so it's an obvious target for optimization.
>
> Reading around, seems many people are still using this 2005 snippet to obtain the
> row count estimate from Explain:

I recommend using FORMAT JSON and extracting the top row count from that.  It is
simpler and less error-prone.

> Is this still the current best practice?  Any tips to increase precision?
> Currently it can estimate the actual number of rows for over or under a million,
> as seen on the sample queries (1,955,297 instead of 1,001,200; 162,080 instead
> of 1,292,010).

Looking at the samples you provided, I get the impression that the statistics for
the table are quite outdated.  That will affect the estimates.  Try running ANALYZE
and see if that improves the estimates.

Yours,
Laurenz Albe



pgsql-performance by date:

Previous
From: Vitalii Tymchyshyn
Date:
Subject: Re: Optimizing count(), but Explain estimates wildly off
Next
From: Alvaro Herrera
Date:
Subject: Re: Optimizing count(), but Explain estimates wildly off