Re: [HACKERS] Proposal: GetOldestXminExtend for ignoring arbitrary vacuum flags - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: [HACKERS] Proposal: GetOldestXminExtend for ignoring arbitrary vacuum flags
Date
Msg-id CAJrrPGen1bJYRHu7VFp13QZUyaLdX5N4AH1cqQdiNd8uLVZWow@mail.gmail.com
In response to Re: [HACKERS] Proposal: GetOldestXminExtend for ignoring arbitrary vacuum flags  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] Proposal: GetOldestXminExtend for ignoring arbitrary vacuum flags  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers


On Wed, Feb 15, 2017 at 11:35 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Feb 15, 2017 at 12:03 PM, Seki, Eiji <seki.eiji@jp.fujitsu.com> wrote:
>> Amit Kapila wrote:
>>> How will you decide just based on oldest xmin whether the tuple is visible or not?  How will you take decisions about tuples which have xmax set?
>>
>> In our use case, GetOldestXmin is used by an original maintainer process[es] on an original control table[s]. The table can be concurrently read or inserted into by any transaction. However, rows in the table can be deleted (xmax set) only by the maintainer process, and one control table can be processed by only one maintainer process at a time.
>>
>> So I do MVCC as follows.
>>
>> - The maintainer's transaction:
>>   - If xmax is set, simply ignore the tuple.
>>   - For other tuples, read tuples if GetOldestXmin() > xmin.
>> - Other transactions: Do ordinary MVCC using their own XID.
>>
>
> Oh, this is a very specific case for which such an API can be useful.
> Earlier, I have seen people proposing some kind of hooks which
> can be used for their specific purpose, but exposing an API or changing
> the signature of an API sounds a bit awkward.  Consider that tomorrow someone
> decides to change this API for some reason; it might become difficult
> to decide because we can't find its usage.

The proposed new API is meant to fix a performance problem in the
vertical clustered index (VCI) feature [1].

During performance tests of VCI run in parallel with OLTP queries, query
performance sometimes drops because the VCI plan is not chosen as the
best plan. This happens when records that need to be moved to the ROS
accumulate in the WOS relation. The more records there are in the WOS
relation, the longer it takes to generate the LOCAL ROS, and for that
reason the VCI plan is not chosen.

Why are there more records in the WOS? We use the GetOldestXmin() function
to identify the oldest transaction still present in the cluster, in order
to decide which data can be moved from WOS to ROS. This function doesn't
ignore ANALYZE transactions that are running in the system, even though
these transactions don't make any changes that affect the data movement
from WOS to ROS.

For this reason, we need a new API, or a change to the existing API, that
returns the oldest xmin while ignoring ANALYZE transactions. That would
reduce the size of the WOS and improve VCI query performance.
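
To make the request concrete, here is a rough model of the idea in
Python. It is only a sketch and not the PostgreSQL implementation: the
Backend class, the oldest_xmin() function and its ignore_flags parameter
are hypothetical names made up for illustration, the flag names are
modeled on the per-backend vacuum flags with illustrative values, and
XID wraparound is ignored entirely.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Illustrative flag bits (assumption: names modeled on the per-backend
# vacuum flags; the values here are only for this sketch).
PROC_IN_VACUUM = 0x02
PROC_IN_ANALYZE = 0x04


@dataclass
class Backend:
    """Hypothetical stand-in for one backend's entry in the proc array."""
    xmin: Optional[int]  # oldest XID this backend still needs, None if idle
    flags: int = 0       # bitmask of the PROC_IN_* flags above


def oldest_xmin(backends: Iterable[Backend], next_xid: int,
                ignore_flags: int = 0) -> int:
    """Return the smallest xmin among backends not matching ignore_flags.

    Falls back to next_xid when no considered backend holds an xmin.
    Plain integer comparison is used, so XID wraparound is not handled.
    """
    result = next_xid
    for backend in backends:
        if backend.flags & ignore_flags:
            continue  # caller asked to skip this kind of backend
        if backend.xmin is not None and backend.xmin < result:
            result = backend.xmin
    return result


backends = [
    Backend(xmin=1000, flags=PROC_IN_ANALYZE),  # long-running ANALYZE
    Backend(xmin=5000),                         # ordinary OLTP transaction
    Backend(xmin=None),                         # idle backend
]

# Without ignoring ANALYZE, the old ANALYZE xmin holds the horizon back.
print(oldest_xmin(backends, next_xid=6000))
# Ignoring ANALYZE lets the horizon advance to the OLTP transaction.
print(oldest_xmin(backends, next_xid=6000, ignore_flags=PROC_IN_ANALYZE))
```

In this model the first call returns 1000 and the second returns 5000,
so rows inserted by transactions between 1000 and 5000 would become
eligible to move from WOS to ROS only in the second case.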


Regards,
Hari Babu
Fujitsu Australia
