Re: Statistics on array values - Mailing list pgsql-performance

From Marco Colli
Subject Re: Statistics on array values
Date
Msg-id CAFvCgN5JLC55y+Dk_5aMiVw-godC4pnVHALw=S0Lo7PnL++g8w@mail.gmail.com
Whole thread Raw
In response to Re: Statistics on array values  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-performance
Is one of those estimates way off reality, or is it only the conjunction which is deranged?

The estimate is wrong *even with a single tag*, without the conjunction (e.g. expected 3500, actual 20). Then the conjunction can make the bias even worse...

On Sun, Feb 2, 2020 at 3:23 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Sun, Feb 02, 2020 at 03:18:19PM +0100, Marco Colli wrote:
> Hello!
>
> Let's say that you have a simple query like the following on a large table
> (for a multi-tenant application):
> SELECT "subscribers".* FROM "subscribers" WHERE "subscribers"."project_id"
> = 123 AND (tags @> ARRAY['de']::varchar[]);
>
> If you run EXPLAIN ANALYZE you can see that stats are completely wrong.
> For example I get an expected count of 3,500 rows whereas the actual
> result is 20 rows. This also results in bad query plans...

https://www.postgresql.org/message-id/CAMkU%3D1z%2BQijUWAYgeqeyw%2BAvD7adPgOmEnY%2BOcTw6qDVFtD7cQ%40mail.gmail.com
On Fri, Jan 10, 2020 at 12:12:52PM -0500, Jeff Janes wrote:
> Why is the estimate off by so much?  If you run a simple select, what the
> actual and expected number of rows WHERE project_id = 12345?  WHERE tags @>
> '{crt:2018_11}'?  Is one of those estimates way off reality, or is it only
> the conjunction which is deranged?

Could you respond to Jeff's inquiry ?

Justin

pgsql-performance by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Statistics on array values
Next
From: Tom Lane
Date:
Subject: Re: Statistics on array values