Re: Parallel append plan instability/randomness - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel append plan instability/randomness
Date
Msg-id CAA4eK1Kxpw4_UXpT0NUWF7zY6_kLxTU9Km9Ebfhc4No3kr40NQ@mail.gmail.com
In response to Re: Parallel append plan instability/randomness  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: Parallel append plan instability/randomness
List pgsql-hackers
On Mon, Jan 8, 2018 at 2:11 PM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> On 8 January 2018 at 13:35, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Mon, Jan 8, 2018 at 11:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Amit Khandekar <amitdkhan.pg@gmail.com> writes:
>>>> The fact that b_star gets moved from the 5th position to the first
>>>> position in the scans indicates that its cost shoots up from 1.04 to
>>>> a value greater than 1.16. It does not look like a case where two
>>>> costs are almost the same, due to which their positions swap sometimes. I
>>>> am trying to figure out what else it can be ...
>>>
>>
>> That occurred to me as well, but still, the change in plan can happen
>> due to the similar costs.
>
> Agreed. But I think we should first fix the issue due to which the
> test might be failing in this case. BTW, for your patch, I am thinking
> we can have a separate factor other than STD_FUZZ_FACTOR? This way,
> we can also make it much smaller than 1.01.
>

Sounds good.  However, I am not sure what the right value for it
should be.  Apart from STD_FUZZ_FACTOR, we also use 1.0000000001 as a
fuzz_factor in the code.  Any suggestions?
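
To make the choice of factor concrete, here is a minimal sketch of a
fuzzy cost comparison.  It is not the planner's code: the function name
and TIGHT_FUZZ_FACTOR are hypothetical, and only the values 1.01 and
1.0000000001 and the costs 1.04 and 1.16 come from this thread.

```c
#include <assert.h>

/* Values mentioned in the thread; TIGHT_FUZZ_FACTOR is a made-up name. */
#define STD_FUZZ_FACTOR   1.01
#define TIGHT_FUZZ_FACTOR 1.0000000001

/*
 * Compare two costs, treating them as equal unless one exceeds the
 * other by more than the given fuzz factor.
 * Returns +1 if a is fuzzily more expensive, -1 if fuzzily cheaper,
 * and 0 if the two are considered the same.
 */
static int
compare_costs_fuzzily(double a, double b, double fuzz_factor)
{
    assert(fuzz_factor >= 1.0);

    if (a > b * fuzz_factor)
        return 1;
    if (b > a * fuzz_factor)
        return -1;
    return 0;
}
```

With a factor of 1.01, costs of 1.040 and 1.045 compare as equal, so
the order of two such subpaths could depend on something other than
cost.  With 1.0000000001 they compare as different.  Costs of 1.04 and
1.16 compare as different under either factor, which is why similar
costs alone do not seem to explain the reported plan change.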

> And anyways,
> STD_FUZZ_FACTOR is used only for comparing paths on the same relation,
> whereas in our case, our comparison goal is different.
>
>> Another possibility as indicated in the
>> previous email is that if somehow the stats of table (reltuples,
>> relpages) is not appropriate, say due to some reason analyze doesn't
>> happen on the table.
>
> Yes, I am also thinking along the same lines. E.g., if relpages is 0
> (due to no analyze run yet), the tuple density calculation follows a
> different logic, due to which reltuples can be considerably larger. I
> suspect this also might be the reason. So yes, I think it's worth having
> ANALYZE on *_star.
>
>> For example, if you just manually update the
>> value of reltuples for b_star in pg_class to 20 or so, you will see
>> the plan as indicated in the failure.  If that is true, then probably
>> doing Analyze before Parallel Append should do the trick.
>
>  Or better still, we can have Analyze in create_misc.sql and
> create_table.sql where the table is populated.
>

Then we would probably want to do it in the misc.sql file, as that
alters some of these tables, which can lead to a rewrite of the table.
I think we can do it either way, but let's first wait for Tom or Robert
to comment on whether they agree with this theory or if they see some
other reason for this behavior.
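
To illustrate the theory, here is a simplified sketch of how a row
estimate can differ depending on whether relpages is set.  It is
modeled on my understanding of the planner's relation size estimation
and is not the actual code: the function name, the size constants, and
the sample numbers are all assumptions for illustration.

```c
#include <assert.h>

/* Assumed sizes for a default build; illustrative only. */
#define BLOCK_SIZE        8192  /* page size in bytes */
#define PAGE_HEADER_SIZE  24    /* per-page header */
#define TUPLE_HEADER_SIZE 24    /* aligned per-tuple header */
#define ITEM_POINTER_SIZE 4     /* per-tuple line pointer */

/*
 * Estimate the number of tuples in a table of curpages pages.
 * If the table has stats (relpages > 0), scale the recorded density.
 * Otherwise guess the density from the average data width alone,
 * assuming every page is completely full.
 */
static double
estimate_tuples(double reltuples, int relpages, int curpages, int data_width)
{
    double density;

    assert(curpages >= 0 && data_width > 0);

    if (relpages > 0)
        density = reltuples / (double) relpages;
    else
    {
        int tuple_width = data_width + TUPLE_HEADER_SIZE + ITEM_POINTER_SIZE;

        /* integer division: whole tuples that fit on one page */
        density = (double) ((BLOCK_SIZE - PAGE_HEADER_SIZE) / tuple_width);
    }

    /* round to the nearest whole tuple (inputs are non-negative) */
    return (double) (long) (density * (double) curpages + 0.5);
}
```

For a hypothetical one-page table holding 4 rows of 12 data bytes each,
the analyzed case (reltuples = 4, relpages = 1) gives 4 rows, whereas
the never-analyzed case (relpages = 0) gives 204 rows, because the page
is assumed to be full.  A jump of that kind in the row estimate would
raise the scan cost and could move the table to a different position
among the Parallel Append subplans.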

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

