
Re: Histogram question. - Mailing list pgsql-sql

From	Steve Midgley
Subject	Re: Histogram question.
Date	April 5, 2022 18:14:57
Msg-id	CAJexoS+NPAb4BLRSRYP=qNwaq3A591D6run1Qwa85mV_10ehYA@mail.gmail.com
In response to	Histogram question. (Jian He <hejian.mark@gmail.com>)
List	pgsql-sql


On Tue, Apr 5, 2022 at 7:35 AM Jian He <hejian.mark@gmail.com> wrote:

Queries in PostgreSQL: 2. Statistics : Postgres Professional

SELECT sum(s.most_common_freqs[
         array_position((s.most_common_vals::text::text[]), v)
       ])
FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
WHERE s.tablename = 'boarding_passes'
  AND s.attname = 'seat_no';

returns 0.6762.

SELECT sum(s.most_common_freqs[
         array_position((s.most_common_vals::text::text[]), v)
       ])
FROM pg_stats s, unnest(s.most_common_vals::text::text[]) v
WHERE s.tablename = 'boarding_passes'
  AND s.attname = 'seat_no'
  AND v > '30C';

returns 0.2127.

SELECT round(
  reltuples * (
    0.2127                              -- from most common values
    + (1 - 0.6762 - 0) * (49 / 100.0)   -- from histogram
  )
)
FROM pg_class
WHERE relname = 'boarding_passes';

In the query above, the part I don't understand is the 49 / 100.0 factor.

I believe the exercise is intended to build a histogram of the data values over a series of intervals. The 49/100 (if I'm reading the source material correctly) refers to finding all the boarding passes in the lower 49 of 100 intervals. I didn't read how the intervals are defined, but I think that's what the "49" refers to.
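For what it's worth, here is a rough sketch of the arithmetic in that last query, using only the numbers quoted in this thread. It assumes that 49 is the number of histogram buckets counted as satisfying the predicate, that 100 is the total number of buckets, and that the "- 0" term is the column's NULL fraction. The function name and the reltuples value are hypothetical; the real query reads reltuples from pg_class.

```python
# Figures quoted in the thread for boarding_passes.seat_no.
MCV_TOTAL = 0.6762      # summed frequency of all most common values
MCV_MATCHING = 0.2127   # summed frequency of the MCVs with seat_no > '30C'
NULL_FRAC = 0.0         # assumption: the "- 0" term is the NULL fraction
MATCHING_BUCKETS = 49   # assumption: buckets counted as satisfying the predicate
TOTAL_BUCKETS = 100     # assumption: total number of histogram buckets


def estimate_selectivity(mcv_matching, mcv_total, null_frac,
                         matching_buckets, total_buckets):
    """Estimated fraction of rows matching: MCV part plus histogram part."""
    # The histogram omits the most common values, so it only describes the
    # rows that are neither NULL nor in the MCV list.
    histogram_share = 1.0 - mcv_total - null_frac
    return mcv_matching + histogram_share * (matching_buckets / total_buckets)


selectivity = estimate_selectivity(
    MCV_MATCHING, MCV_TOTAL, NULL_FRAC, MATCHING_BUCKETS, TOTAL_BUCKETS
)

# Hypothetical row count, for illustration only.
reltuples = 1_000_000
estimated_rows = round(reltuples * selectivity)

print(f"selectivity = {selectivity:.6f}")
print(f"estimated rows = {estimated_rows}")
```

Under those assumptions, 49 / 100.0 is the share of the histogram-covered rows expected to match, and it is scaled by (1 - 0.6762 - 0) because the histogram leaves out the most common values.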
