Thread: Re: [WIP] Performance Improvement by reducing WAL for Update Operation

Re: [WIP] Performance Improvement by reducing WAL for Update Operation

From
Heikki Linnakangas
Date:
On 04.08.2012 11:01, Amit Kapila wrote:
> Missed one point which needs to be handled is pg_upgrade

I don't think there's anything to do for pg_upgrade. This doesn't change 
the on-disk data format, just the WAL format, and pg_upgrade isn't 
sensitive to WAL format changes.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: [WIP] Performance Improvement by reducing WAL for Update Operation

From
Bruce Momjian
Date:
On Sat, Aug  4, 2012 at 05:21:06PM +0300, Heikki Linnakangas wrote:
> On 04.08.2012 11:01, Amit Kapila wrote:
> >Missed one point which needs to be handled is pg_upgrade
> 
> I don't think there's anything to do for pg_upgrade. This doesn't
> change the on-disk data format, just the WAL format, and pg_upgrade
> isn't sensitive to WAL format changes.

Correct.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: [WIP] Performance Improvement by reducing WAL for Update Operation

From
Amit Kapila
Date:
From: Bruce Momjian [mailto:bruce@momjian.us] 
Sent: Saturday, August 04, 2012 8:06 PM
> On Sat, Aug  4, 2012 at 05:21:06PM +0300, Heikki Linnakangas wrote:
>> On 04.08.2012 11:01, Amit Kapila wrote:
>>> Missed one point which needs to be handled is pg_upgrade
>>
>> I don't think there's anything to do for pg_upgrade. This doesn't
>> change the on-disk data format, just the WAL format, and pg_upgrade
>> isn't sensitive to WAL format changes.
>
> Correct.

Thanks Bruce and Heikki for this information. 

I need your feedback on the design point below, as it will make my further
work on this performance issue clearer.
Also, let me know if the explanation is unclear; I can use some examples to
explain my point.

Currently, the solution for fixed-length columns cannot handle
variable-length columns and NULLs. The reason is that fixed-length columns
need no diff between the old and new tuple, whereas the other cases do.
For fixed-length columns, recording just the OFFSET, LENGTH, and VALUE of
each changed column of the new tuple in WAL is sufficient to replay the
WAL. To handle the other cases, however, we need a diff mechanism.
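
To make the fixed-length case concrete, here is a minimal sketch in Python
(not the patch's C code). The record shape, the function names, and the
column layout are all assumptions made for illustration; the only idea taken
from the text above is that each changed column is logged as OFFSET, LENGTH,
VALUE and replayed by patching the old tuple.

```python
from typing import List, Tuple

# Hypothetical record type: (offset, length, value) for one changed column.
ChangeRecord = Tuple[int, int, bytes]


def encode_fixed_changes(old: bytes, new: bytes,
                         columns: List[Tuple[int, int]]) -> List[ChangeRecord]:
    """Return one (offset, length, value) record per changed column.

    `columns` lists (offset, length) for every column. Both tuples are
    assumed to share this layout, which only holds when there are no NULLs
    and no variable-length columns.
    """
    if len(old) != len(new):
        raise ValueError("fixed-length layout requires equal tuple sizes")
    records = []
    for offset, length in columns:
        value = new[offset:offset + length]
        if old[offset:offset + length] != value:
            records.append((offset, length, value))
    return records


def replay_fixed_changes(old: bytes, records: List[ChangeRecord]) -> bytes:
    """Rebuild the new tuple by overwriting the changed ranges of the old one."""
    buf = bytearray(old)
    for offset, length, value in records:
        buf[offset:offset + length] = value
    return bytes(buf)


# Illustrative layout: two 4-byte integers followed by one 8-byte integer.
layout = [(0, 4), (4, 4), (8, 8)]
old_tuple = (1).to_bytes(4, "little") + (10).to_bytes(4, "little") + (100).to_bytes(8, "little")
new_tuple = (1).to_bytes(4, "little") + (11).to_bytes(4, "little") + (100).to_bytes(8, "little")

records = encode_fixed_changes(old_tuple, new_tuple, layout)
assert replay_fixed_changes(old_tuple, records) == new_tuple
```

Only the second column changes here, so a single record is logged instead of
the whole tuple. The sketch also shows the limitation: once a column can
change length or become NULL, the offsets of later columns shift and
overwriting in place no longer reproduces the new tuple.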

Can we do something like this: if the changed columns are fixed-length and
contain no NULLs, store the [OFFSET, LENGTH, VALUE] format in WAL, and store
the diff format for all other cases?

The advantage is that updates touching only fixed-length columns don't pay
the penalty of computing a diff between the new and old tuple. We can also
split the work into two parts: one for fixed-length columns and a second to
handle the other cases.
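
For the other cases, a diff has to cope with bytes that shift when a column
changes length. The Python sketch below shows one very simple diff that
treats the tuple as opaque bytes and keeps only the common prefix and suffix.
It is an assumption chosen for illustration, not the diff format of this
patch, and the names `diff_tuple` and `apply_diff` are hypothetical.

```python
from typing import Tuple

# Hypothetical diff record: (prefix_len, suffix_len, middle_bytes).
TupleDiff = Tuple[int, int, bytes]


def diff_tuple(old: bytes, new: bytes) -> TupleDiff:
    """Describe `new` as: shared prefix of old + middle bytes + shared suffix of old."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    # The suffix must not overlap the bytes already claimed by the prefix.
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    middle = new[prefix:len(new) - suffix]
    return prefix, suffix, middle


def apply_diff(old: bytes, diff: TupleDiff) -> bytes:
    """Rebuild the new tuple from the old tuple and the diff record."""
    prefix, suffix, middle = diff
    tail = old[len(old) - suffix:] if suffix else b""
    return old[:prefix] + middle + tail


# A variable-length value grows, so the trailing bytes move to a new offset.
old_tuple = b"id01|alice|x"
new_tuple = b"id01|alexandra|x"

diff = diff_tuple(old_tuple, new_tuple)
assert apply_diff(old_tuple, diff) == new_tuple
```

Because the record stores lengths relative to both ends of the old tuple,
replay works even though the two tuples differ in size. The cost is the
byte-by-byte comparison on every update, which is the penalty the
fixed-length shortcut would avoid.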


With Regards,
Amit Kapila.



Re: [WIP] Performance Improvement by reducing WAL for Update Operation

From
Heikki Linnakangas
Date:
On 06.08.2012 06:10, Amit Kapila wrote:
> Currently the solution for fixed length columns cannot handle the case of
> variable length columns and NULLS. The reason is for fixed length columns
> there is no need of diff technology between old and new tuple, however for
> other cases it will be required.
> For fixed length columns, if we just note the OFFSET, LENGTH, VALUE of
> changed columns of new tuple in WAL, it will be sufficient to do the replay
> of WAL. However to handle other cases we need to use diff mechanism.
>
> Can we do something like if the changed columns are fixed length and don't
> contain NULLs, then store [OFFSET, LENGTH, VALUE] format in WAL and for
> other cases store diff format.
>
> This has advantage that for Updates containing only fixed length columns
> don't have to pay penalty of doing diff between new and old tuple. Also we
> can do the whole work in 2 parts, one for fixed length columns and second to
> handle other cases.

Let's keep it simple and use the same diff format for all tuples, at 
least for now. If it turns out that you can indeed get even more gain 
for fixed length tuples by something like that, then let's do that later 
as a separate patch.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: [WIP] Performance Improvement by reducing WAL for Update Operation

From
Amit Kapila
Date:
From: Heikki Linnakangas [mailto:heikki.linnakangas@enterprisedb.com] 
Sent: Monday, August 06, 2012 2:32 PM
To: Amit Kapila
Cc: 'Bruce Momjian'; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [WIP] Performance Improvement by reducing WAL for
Update Operation
On 06.08.2012 06:10, Amit Kapila wrote:
>> Currently the solution for fixed length columns cannot handle the case of
>> variable length columns and NULLS. The reason is for fixed length columns
>> there is no need of diff technology between old and new tuple, however for
>> other cases it will be required.
>> For fixed length columns, if we just note the OFFSET, LENGTH, VALUE of
>> changed columns of new tuple in WAL, it will be sufficient to do the replay
>> of WAL. However to handle other cases we need to use diff mechanism.
>>
>> Can we do something like if the changed columns are fixed length and don't
>> contain NULLs, then store [OFFSET, LENGTH, VALUE] format in WAL and for
>> other cases store diff format.
>>
>> This has advantage that for Updates containing only fixed length columns
>> don't have to pay penalty of doing diff between new and old tuple. Also we
>> can do the whole work in 2 parts, one for fixed length columns and second to
>> handle other cases.

> Let's keep it simple and use the same diff format for all tuples, at 
> least for now. If it turns out that you can indeed get even more gain 
> for fixed length tuples by something like that, then let's do that later 
> as a separate patch.

Okay, I shall first design and implement the same format for all tuples and
then discuss the results with the community.

With Regards,
Amit Kapila.