Re: How to specify/mock the statistic data of tables in PostgreSQL - Mailing list pgsql-general

From	Felix.徐
Subject	Re: How to specify/mock the statistic data of tables in PostgreSQL
Date	January 13, 2014 06:52:02
Msg-id	CAPmhLM2XXZ1fX75ECR2aJb7sZUPwu4uFvxUn74T0ho0G9K=oiQ@mail.gmail.com
In response to	Re: How to specify/mock the statistic data of tables in PostgreSQL (Amit Langote <amitlangote09@gmail.com>)
Responses	Re: How to specify/mock the statistic data of tables in PostgreSQL
List	pgsql-general

I see, thanks.

I'm looking into the source code of the statistics part now, and I'm a little confused about the column "staop" in the table pg_statistic.

In pg_statistic.h, the comment says:

/* ----------------
 * To allow keeping statistics on different kinds of datatypes,
 * we do not hard-wire any particular meaning for the remaining
 * statistical fields. Instead, we provide several "slots" in which
 * statistical data can be placed. Each slot includes:
 *      kind       integer code identifying kind of data (see below)
 *      op         OID of associated operator, if needed
 *      numbers    float4 array (for statistical values)
 *      values     anyarray (for representations of data values)
 * The ID and operator fields are never NULL; they are zeroes in an
 * unused slot. The numbers and values fields are NULL in an unused
 * slot, and might also be NULL in a used slot if the slot kind has
 * no need for one or the other.
 * ----------------

And,

//line 194 : In a "most common values" slot, staop is the OID of the "=" operator used to decide whether values are the same or not.

//line 206 : A "histogram" slot describes the distribution of scalar data. staop is the OID of the "<" operator that describes the sort ordering.

....

I don't understand the function of staop here. How is it used in the optimizer? Is there an example? Thanks!
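
One rough way to picture the role of staop: it records which operator a slot's data was computed with, so code that needs, say, a histogram ordered by a particular "<" can check that the stored slot matches before relying on it. The Python sketch below is only a simplified model of that lookup, not PostgreSQL's code. `Slot`, `find_slot`, and the sample row are hypothetical, and the operator OIDs are assumed values for the int4 "=" and "<" operators.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Slot kind codes as defined in pg_statistic.h.
STATISTIC_KIND_MCV = 1
STATISTIC_KIND_HISTOGRAM = 2

# Illustrative operator OIDs (assumed here to be int4 "=" and int4 "<").
INT4_EQ_OP = 96
INT4_LT_OP = 97


@dataclass
class Slot:
    kind: int                                  # models stakindN
    op: int                                    # models staopN (0 in an unused slot)
    numbers: Optional[Sequence[float]] = None  # models stanumbersN
    values: Optional[Sequence[int]] = None     # models stavaluesN


def find_slot(slots, kind, op=None):
    """Return the first slot of the requested kind.

    If op is given, the slot's operator must match as well; otherwise
    any operator is accepted.
    """
    for slot in slots:
        if slot.kind != kind:
            continue
        if op is not None and slot.op != op:
            continue
        return slot
    return None


# Hypothetical statistics row for an integer column.
row_slots = [
    Slot(STATISTIC_KIND_MCV, INT4_EQ_OP, numbers=[0.30, 0.20], values=[7, 42]),
    Slot(STATISTIC_KIND_HISTOGRAM, INT4_LT_OP, values=[1, 25, 50, 75, 100]),
    Slot(0, 0),  # unused slot: kind and op are zero, arrays are NULL
]

# A histogram sorted by int4 "<" is found; one for some other ordering is not.
hist = find_slot(row_slots, STATISTIC_KIND_HISTOGRAM, op=INT4_LT_OP)
other = find_slot(row_slots, STATISTIC_KIND_HISTOGRAM, op=12345)
```

In this model a caller that does not care about the operator simply omits `op`, while a caller that depends on a specific sort order passes it and treats a miss as "no usable statistics".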

2014/1/10 Amit Langote <amitlangote09@gmail.com>

On Fri, Jan 10, 2014 at 11:19 PM, Atri Sharma <atri.jiit@gmail.com> wrote:
>
>
> Sent from my iPad
>
> On 10-Jan-2014, at 19:42, "ygnhzeus" <ygnhzeus@gmail.com> wrote:
>
> Thanks for your reply.
> So correlation is not related to the calculation of selectivity, right? If I
> force PostgreSQL not to optimize the join order (by setting
> join_collapse_limit and from_collapse_limit to 1), is there any other
> factor that may affect the structure of the execution plan, regardless of the
> data access method?
>
> 2014-01-10
> ________________________________
> ygnhzeus
> ________________________________
> From: Amit Langote <amitlangote09@gmail.com>
> Sent: 2014-01-10 22:00
> Subject: Re: [GENERAL] How to specify/mock the statistic data of tables in
> PostgreSQL
> To: "ygnhzeus" <ygnhzeus@gmail.com>
> Cc: "pgsql-general" <pgsql-general@postgresql.org>
>
>
>
> AFAIK, correlation is involved in calculating the costs that are used for
> deciding the type of access. If the correlation is low, an index scan can lead
> to quite a few random reads, hence leading to higher costs.
>

Ah, I forgot to mention this point about how the planner uses correlation
for access method selection.
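
As a minimal sketch of that idea, and assuming the index-scan I/O cost is interpolated between a best case (perfectly correlated, mostly sequential reads) and a worst case (uncorrelated, mostly random reads) using the square of the correlation, the effect can be illustrated in Python as follows. The function name and the sample costs are hypothetical, and the interpolation formula is an assumption about how the costing behaves, not a copy of the planner code.

```python
def index_io_cost(min_io_cost, max_io_cost, correlation):
    """Interpolate I/O cost between the correlated and uncorrelated extremes.

    correlation is expected to lie in [-1, 1]; squaring it means strong
    negative correlation helps as much as strong positive correlation.
    """
    csquared = correlation * correlation
    return max_io_cost + csquared * (min_io_cost - max_io_cost)


# Hypothetical costs: 10 units if heap order matches index order, 100 if not.
perfect = index_io_cost(10.0, 100.0, 1.0)
none = index_io_cost(10.0, 100.0, 0.0)
partial = index_io_cost(10.0, 100.0, 0.5)
```

With these made-up numbers a low correlation leaves the estimate close to the worst case, which is why a sequential or bitmap scan can win over a plain index scan on a poorly correlated column.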

And selectivity is a function of the statistical distribution of column
values described in pg_statistic by histograms, most common values
(with their occurrence frequencies), number of distinct values, etc.
It has nothing to do with correlation.
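
To make this concrete, here is a simplified Python sketch of how selectivity could be derived from those statistics: an equality estimate from the most-common-values list plus the number of distinct values, and a "<" estimate from equal-frequency histogram bounds. All function names and sample numbers are hypothetical, and the real planner handles more cases (for example, combining the MCV list and the histogram for range conditions), so treat this as an illustration rather than the actual implementation.

```python
from bisect import bisect_right


def eq_selectivity(const, mcv_values, mcv_freqs, n_distinct, null_frac=0.0):
    """Estimate the fraction of rows satisfying col = const."""
    if const in mcv_values:
        # The value is a most common value: use its recorded frequency.
        return mcv_freqs[mcv_values.index(const)]
    # Otherwise spread the non-MCV, non-NULL rows evenly over the
    # remaining distinct values.
    remaining = 1.0 - null_frac - sum(mcv_freqs)
    other_distinct = n_distinct - len(mcv_values)
    if other_distinct <= 0:
        return 0.0
    return remaining / other_distinct


def lt_selectivity(const, bounds):
    """Estimate the fraction of rows satisfying col < const.

    bounds are histogram boundaries; each bucket is assumed to hold the
    same share of rows, with values spread linearly inside a bucket.
    """
    n_buckets = len(bounds) - 1
    if const <= bounds[0]:
        return 0.0
    if const >= bounds[-1]:
        return 1.0
    i = bisect_right(bounds, const) - 1          # bucket containing const
    frac = (const - bounds[i]) / (bounds[i + 1] - bounds[i])
    return (i + frac) / n_buckets


# Hypothetical statistics for an integer column.
sel_common = eq_selectivity(7, [7, 42], [0.30, 0.20], n_distinct=12)
sel_rare = eq_selectivity(5, [7, 42], [0.30, 0.20], n_distinct=12)
sel_range = lt_selectivity(30, [0, 25, 50, 75, 100])
```

Note that correlation appears nowhere in either function: it affects the cost of fetching the rows, not the estimate of how many rows match.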

--
Amit Langote
