Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem - Mailing list pgsql-hackers

From Anastasia Lubennikova
Subject Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Date
Msg-id c4a728a9-e86e-2f23-1472-c2f380f74727@postgrespro.ru
Whole thread Raw
In response to Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem  (Claudio Freire <klaussfreire@gmail.com>)
Responses Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers
27.12.2016 20:14, Claudio Freire:
On Tue, Dec 27, 2016 at 10:41 AM, Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
1417                tblk =
ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
(gdb) bt
#0  0x00000000006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
#1  0x0000000000693dfe in lazy_scan_heap (onerel=0x1ec2360, options=9,
vacrelstats=0x1ef6e00, Irel=0x1ef7168, nindexes=2, aggressive=1 '\001')   at vacuumlazy.c:1337
#2  0x0000000000691e66 in lazy_vacuum_rel (onerel=0x1ec2360, options=9,
params=0x7ffe0f866310, bstrategy=0x1f1c4a8) at vacuumlazy.c:290
#3  0x000000000069191f in vacuum_rel (relid=1247, relation=0x0, options=9,
params=0x7ffe0f866310) at vacuum.c:1418
Those line numbers don't match my code.

Which commit are you based on?

My tree is (currently) based on 71f996d22125eb6cfdbee6094f44370aa8dec610

Hm, my branch is based on 71f996d22125eb6cfdbee6094f44370aa8dec610 as well.
I merely applied patches 0001-Vacuum-prefetch-buffers-on-backward-scan-v3.patch
and 0002-Vacuum-allow-using-more-than-1GB-work-mem-v3.patch
then ran configure and make as usual.
Am I doing something wrong?

Anyway, I found the problem that had caused segfault.

for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; tupindex = 0, segindex++)
{
    DeadTuplesSegment *seg = &(vacrelstats->dead_tuples.dead_tuples[segindex]);
    int            num_dead_tuples = seg->num_dead_tuples;

    while (tupindex < num_dead_tuples)
    ...

You rely on the value of tupindex here, while during the very first pass the 'tupindex' variable
may contain any garbage. And it happend that on my system there was negative value
as I found inspecting core dump:

(gdb) info locals
num_dead_tuples = 5
tottuples = 0
tupindex = -1819017215

Which leads to failure in the next line
    tblk = ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);

The solution is to move this assignment inside the cycle.

I've read the second patch.

1. What is the reason to inline vac_cmp_itemptr() ?

2. +#define LAZY_MIN_TUPLES        Max(MaxHeapTuplesPerPage, (128<<20) / sizeof(ItemPointerData))
What does 128<<20 mean? Why not 1<<27? I'd ask you to replace it with named constant,
or at least add a comment.

I'll share my results of performance testing it in a few days.
-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: [HACKERS] proposal: session server side variables
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Reporting planning time with EXPLAIN