Re: Freeze avoidance of very large table. - Mailing list pgsql-hackers

From Sawada Masahiko
Subject Re: Freeze avoidance of very large table.
Date
Msg-id CAD21AoCwt7szDLPnLEyjz+xq0GXsnrmfvPA0Mr-OUdLa+OmUDw@mail.gmail.com
In response to Re: Freeze avoidance of very large table.  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Freeze avoidance of very large table.  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Freeze avoidance of very large table.  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Jul 2, 2015 at 1:06 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Thu, Jul 2, 2015 at 12:13 AM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>> On Thu, May 28, 2015 at 11:34 AM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>>> On Thu, Apr 30, 2015 at 8:07 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>>>> On Fri, Apr 24, 2015 at 11:21 AM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>>>>> On Fri, Apr 24, 2015 at 1:31 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>>>>>> On 4/23/15 11:06 AM, Petr Jelinek wrote:
>>>>>>>
>>>>>>> On 23/04/15 17:45, Bruce Momjian wrote:
>>>>>>>>
>>>>>>>> On Thu, Apr 23, 2015 at 09:45:38AM -0400, Robert Haas wrote:
>>>>>>>> Agreed, no extra file, and the same write volume as currently.  It would
>>>>>>>> also match pg_clog, which uses two bits per transaction --- maybe we can
>>>>>>>> reuse some of that code.
>>>>>>>>
>>>>>>>
>>>>>>> Yeah, this approach seems promising. We probably can't reuse code from
>>>>>>> clog because the usage pattern is different (key for clog is xid, while
>>>>>>> for visibility/freeze map ctid is used). But visibility map storage
>>>>>>> layer is pretty simple so it should be easy to extend it for this use.
>>>>>>
>>>>>>
>>>>>> Actually, there may be some bit manipulation functions we could reuse;
>>>>>> things like efficiently counting how many things in a byte are set. Probably
>>>>>> doesn't make sense to fully refactor it, but at least CLOG is a good source
>>>>>> for cut/paste/whack.
>>>>>>
>>>>>
>>>>> I agree with adding a bit that indicates corresponding page is
>>>>> all-frozen into VM, just like CLOG.
>>>>> I'll change the patch as second version patch.
>>>>>
>>>>
>>>> The second patch is attached.
>>>>
>>>> In second patch, I added a bit that indicates all tuples in page are
>>>> completely frozen into visibility map.
>>>> The visibility map became a bitmap with two bit per heap page:
>>>> all-visible and all-frozen.
>>>> The logics around vacuum, insert/update/delete heap are almost same as
>>>> previous version.
>>>>
>>>> This patch lack some point: documentation, comment in source code,
>>>> etc, so it's WIP patch yet,
>>>> but I think that it's enough to discuss about this.
>>>>
>>>
>>> The previous patch is no longer applied cleanly to HEAD.
>>> The attached v2 patch is latest version.
>>>
>>> Please review it.
>>
>> Attached new rebased version patch.
>> Please give me comments!
>
> Now we should review your design and approach rather than code,
> but since I got an assertion error while trying the patch, I report it.
>
> "initdb -D test -k" caused the following assertion failure.
>
> vacuuming database template1 ... TRAP:
> FailedAssertion("!((((PageHeader) (heapPage))->pd_flags & 0x0004))",
> File: "visibilitymap.c", Line: 328)
> sh: line 1: 83785 Abort trap: 6
> "/dav/000_add_frozen_bit_into_visibilitymap_v3/bin/postgres" --single
> -F -O -c search_path=pg_catalog -c exit_on_error=true template1 >
> /dev/null
> child process exited with exit code 134
> initdb: removing data directory "test"

Thank you for the bug report and the comments.

A fixed version is attached, and the source code comments are also updated.
Please review it.

Let me explain again what this patch does and its current design.

- An additional bit for the visibility map
I added an additional bit to the visibility map, the all-frozen bit,
which indicates whether all tuples of the corresponding heap page are
frozen.
This layout is similar to CLOG, which also uses two bits per entry.
As a result, the VM becomes twice as large as it is today.
Also, the heap page header flags may have PD_ALL_FROZEN set, in
addition to the all-visible flag.

- Setting and clearing the all-frozen bit
Update, delete, and insert (including multi-insert) operations clear the
bit for the affected page, and clear the page header flag at the same time.
Only vacuum can set the bit, and only if all tuples on the page are frozen.

- Anti-wraparound vacuum
Today we have to scan the whole table for XID anti-wraparound, and that
is really expensive because of the disk I/O.
The main benefit of this proposal is to reduce or avoid that very large
amount of I/O even when an anti-wraparound vacuum is executed.
I added experimental logic for this in the lazy_scan_heap() function.
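
To make the design above easier to discuss, here is a minimal standalone
sketch of a bitmap with two bits per heap page. It is not taken from the
attached patch: the names (vm_set, vm_clear, vm_test, pages_to_scan,
VM_ALL_VISIBLE, VM_ALL_FROZEN) and the bit values are hypothetical, and
the real vacuum has more conditions than the single check shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: two bits per heap page, four heap pages per map byte. */
#define BITS_PER_HEAPBLOCK   2
#define HEAPBLOCKS_PER_BYTE  (8 / BITS_PER_HEAPBLOCK)
#define VM_ALL_VISIBLE       0x01   /* assumed bit value, for illustration */
#define VM_ALL_FROZEN        0x02   /* assumed bit value, for illustration */

/* Shift that moves a page's two flag bits to their slot inside the map byte. */
static inline unsigned vm_shift(uint32_t blk)
{
    return (blk % HEAPBLOCKS_PER_BYTE) * BITS_PER_HEAPBLOCK;
}

/* Set one or both flags for a heap page (what vacuum would do). */
static inline void vm_set(uint8_t *map, uint32_t blk, uint8_t flags)
{
    map[blk / HEAPBLOCKS_PER_BYTE] |= (uint8_t) (flags << vm_shift(blk));
}

/* Clear both flags for a heap page (what insert/update/delete would do). */
static inline void vm_clear(uint8_t *map, uint32_t blk)
{
    uint8_t mask = (uint8_t) ((VM_ALL_VISIBLE | VM_ALL_FROZEN) << vm_shift(blk));

    map[blk / HEAPBLOCKS_PER_BYTE] &= (uint8_t) ~mask;
}

/* Return true if every flag in 'flags' is set for the heap page. */
static inline bool vm_test(const uint8_t *map, uint32_t blk, uint8_t flags)
{
    uint8_t bits = (uint8_t) (map[blk / HEAPBLOCKS_PER_BYTE] >> vm_shift(blk));

    return (bits & flags) == flags;
}

/*
 * Count the heap pages an anti-wraparound scan would still have to read
 * if it skipped every page whose all-frozen bit is set.
 */
static inline uint32_t pages_to_scan(const uint8_t *map, uint32_t nblocks)
{
    uint32_t n = 0;

    for (uint32_t blk = 0; blk < nblocks; blk++)
        if (!vm_test(map, blk, VM_ALL_FROZEN))
            n++;
    return n;
}
```

With this layout one map byte covers four heap pages instead of eight,
which is where the doubling of the VM size comes from.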

There were several other ideas in the previous discussion, such as a
read-only table or a separate frozen map. The advantage of this direction
is that we don't need an additional file and can reuse the mature VM
mechanism.

Regards,

--
Sawada Masahiko
