Thread: FW: [WIP] Performance Improvement by reducing WAL for Update Operation

FW: [WIP] Performance Improvement by reducing WAL for Update Operation

From

Amit Kapila

Date:

03 August 2012, 12:13:29

One thing I forgot to mention that for tests I have used pg_prewarm utility to load all the data in shared buffers before start of test.

With Regards,

Amit Kapila.

From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Amit kapila
Sent: Friday, August 03, 2012 5:17 PM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] [WIP] Performance Improvement by reducing WAL for Update Operation

Problem statement:

-----------------------------------

Reducing wal size for an update operation for performance improvement.

Advantages:

---------------------
1. Observed increase in performance with pgbench when server is running in sync_commit off mode.
a. with pgbench (tpc_b) - 13%
b. with modified pgbench (such that size of modified columns are less than all row) - 83%

2. WAL size is reduced

Design/Impementation:

------------------------------

Currently the change is done only for fixed length columns for simple tables and the tuple should not contain NULLS.

This is a Proof of concept, the design and implementation needs to be changed based on final design required for handling other scenario's

Update operation:
-----------------------------
1. Check for the simple table or not.(No toast, No before update triggers)
2. Works only for not null tuples.
3. Identify the modified columns from the target entry.
4. Based on the modified column list, check for any variable length columns are modified, if so this optimization is not applied.
5. Identify the offset and length for the modified columns and store it as an optimized WAL tuple in the following format.
Note: Wal update header is modified to denote whether wal update optimization is done or not.
WAL update header + Tuple header(no change from previous format) +
[offset(2bytes)] [length(2 bytes)] [changed data value]
[offset(2bytes)] [length(2 bytes)] [changed data value]
....
....

Recovery:

----------------
The following steps are only incase of the tuple is optimized.

6. For forming the new tuple, old tuple is required.(including if the old tuple does not require any modifications also).
7. Form the new tuple based on the format specified in the 5th point.
8. once new tuple is framed, follow the exisiting behavior.

Frame the new tuple from old tuple and WAL record:

1. The length of the data which is needs to be copied from old tuple is calculated as
the difference of offset present in the WAL record and the old tuple offset.
(for the first time, the old tuple offset value is zero)
2. Once the old tuple data copied, then increase the offset for old tuple by the
copied length.
3. Get the length and value of modified column from WAL record, copy it into new tuple.
4. Increase the old tuple offset with the modified column length.
5. Repeat this procedure until the WAL record reaches the end.
6. If any remaining left out old tuple data will be copied.

Test results:

----------------------
1. The pgbench test run for 10min.

2. pgbench result for tpc-b is attached with this mail as pgbench_org

3. modified pgbench(such that size of modified columns are less than all row) result for tpc-b is attached with this mail as pgbench_1800_300

Modified pgbench code:

---------------------------------------
1. Schema of the tables are modified as added some extra fields to increase the record size to 1800.
2. The tcp_b benchmark suite to do only update operations.
3. The update operation changed as to update 3 columns with 300 bytes out of total size of 1800 bytes.
4. During initialization of tables removed the NULL value insertions.

I am working on solution to handle other scenarios like variable length columns, tuple contain NULLs, handling for before triggers.

Please provide suggestions/objections?

With Regards,

Amit Kapila.