Re: Parallel append plan instability/randomness - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Parallel append plan instability/randomness
Date	January 9, 2018 07:05:30
Msg-id	CAA4eK1Jh+8VXDFaxUF7A4v10sHHzDM0XV8pimJKPVP+2GaBKGg@mail.gmail.com
In response to	Re: Parallel append plan instability/randomness (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On Tue, Jan 9, 2018 at 12:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, Jan 7, 2018 at 11:40 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> One theory that can explain above failure is that the costs of
>>> scanning some of the sub-paths is very close due to which sometimes
>>> the results can vary.  If that is the case, then probably using
>>> fuzz_factor in costs comparison (as is done in attached patch) can
>>> improve the situation, may be we have to consider some other factors
>>> like number of rows in each subpath.
>
>> This isn't an acceptable solution because sorting requires that the
>> comparison operator satisfy the transitive property; that is, if a = b
>> and b = c then a = c.  With your proposed comparator, you could have a
>> = b and b = c but a < c.  That will break stuff.
>
>> It seems like the obvious fix here is to use a query where the
>> contents of the partitions are such that the sorting always produces
>> the same result.  We could do that either by changing the query or by
>> changing the data in the partitions or, maybe, by inserting ANALYZE
>> someplace.
>
> The foo_star tables are made in create_table.sql, filled in
> create_misc.sql, and not modified thereafter.  The fact that we have
> accurate rowcounts for them in select_parallel.sql is because of the
> database-wide VACUUM that happens at the start of sanity_check.sql.
> Given the lack of any WHERE condition, the costs in this particular query
> depend only on the rowcount and physical table size, so inserting an
> ANALYZE shouldn't (and doesn't, for me) change anything.  I would be
> concerned about side-effects on other queries anyway if we were to ANALYZE
> tables that have never been ANALYZEd in the regression tests before.
>

Fair point.  This seems to indicate that inaccurate rowcounts are
probably not the reason for the failure.  However, I think it might
still be good to use a different set of tables (probably by creating new
tables with appropriate data for these queries) and analyze them
explicitly before these queries, rather than relying on the execution
order of some not directly related tests.
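
On the fuzz_factor objection quoted above, the loss of transitivity can
be shown with a small sketch.  This is not the attached patch: the
function name fuzzy_cmp, the 1% fuzz value, and the three costs are all
made up for illustration, and Python stands in for the C comparator.

```python
from functools import cmp_to_key
from itertools import permutations

# Hypothetical fuzz: costs within 1% of each other compare as "equal".
FUZZ_FACTOR = 1.01


def fuzzy_cmp(cost_a, cost_b, fuzz=FUZZ_FACTOR):
    """Three-way comparison that treats nearly equal costs as equal."""
    if cost_a > cost_b * fuzz:
        return 1
    if cost_b > cost_a * fuzz:
        return -1
    return 0


# Made-up subpath costs: a ~ b and b ~ c, yet a is clearly below c.
a, b, c = 100.0, 100.9, 101.8

print(fuzzy_cmp(a, b))  # a and b are within the fuzz
print(fuzzy_cmp(b, c))  # b and c are within the fuzz
print(fuzzy_cmp(a, c))  # a and c are not

# Sort every input ordering with the fuzzy comparator and collect the
# distinct results; more than one result means the final order depends
# on the input order, not only on the costs.
results = {
    tuple(sorted(p, key=cmp_to_key(fuzzy_cmp)))
    for p in permutations([a, b, c])
}
print(results)
```

With such a comparator "equal" is no longer an equivalence relation, so
a sort routine is free to produce different orders for the same set of
costs, which is the kind of breakage the objection refers to.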


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
