Thread: [BUGS] BUG #14668: BRIN open autosummarize=on , database will crash

[BUGS] BUG #14668: BRIN open autosummarize=on , database will crash

From
digoal@126.com
Date:
The following bug has been logged on the website:

Bug reference:      14668
Logged by:          Zhou Digoal
Email address:      digoal@126.com
PostgreSQL version: 10beta1
Operating system:   CentOS 6.x x64
Description:

Hi, when I test a BRIN index with autosummarize=on, inserting data crashes
the database.

```
postgres=# create table test(id serial8, c1 int, c2 int);
CREATE TABLE
postgres=# create index idx_test_1 on test using brin(id) with
(pages_per_range=1,autosummarize=on);
CREATE INDEX

vi test.sql
\set c1 random(1,10000)
\set c2 random(1,1000000)
insert into test (c1,c2) values (:c1, :c2);

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 100
```

Then PostgreSQL crashes.

Log:

```
,,0,LOG,00000,"server process (PID 38060) was terminated by signal 11:
Segmentation fault","Failed process was running: insert into test (c1,c2)
values ($1, $2);",,,,,,,"LogChildExit, postmaster.c:3553",""
,,0,LOG,00000,"terminating any other active server
processes",,,,,,,,"HandleChildCrash, postmaster.c:3273",""
,71/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,37/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,70/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,69/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,1/0,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,35/2,0,WARNING,57P02,"terminating connection because of crash of another
server process","The postmaster has commanded this server process to roll
back the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.","In a moment you should be
able to reconnect to the database and repeat your command.",,,,,,"quickdie,
postgres.c:2595",""
,,0,LOG,00000,"all server processes terminated;
reinitializing",,,,,,,,"PostmasterStateMachine, postmaster.c:3800",""
,,0,LOG,00000,"database system was interrupted; last known up at 2017-05-24
14:22:26 CST",,,,,,,,"StartupXLOG, xlog.c:6256",""
,,0,LOG,00000,"database system was not properly shut down; automatic
recovery in progress",,,,,,,,"StartupXLOG, xlog.c:6759",""
,,0,LOG,00000,"redo starts at 0/408E7140",,,,,,,,"StartupXLOG,
xlog.c:7014",""
,,0,LOG,00000,"invalid record length at 0/4090F028: wanted 24, got
0",,,,,,,,"ReadRecord, xlog.c:4184",""
,,0,LOG,00000,"redo done at 0/4090EEF0",,,,,,,,"StartupXLOG,
xlog.c:7286",""
,,0,LOG,00000,"last completed transaction was at log time 2017-05-24
14:24:49.118091+08",,,,,,,,"StartupXLOG, xlog.c:7291",""
,,0,LOG,00000,"checkpoint starting: end-of-recovery
immediate",,,,,,,,"LogCheckpointStart, xlog.c:8369",""
,,0,LOG,00000,"checkpoint complete: wrote 117 buffers (0.0%); 0 WAL file(s)
added, 0 removed, 0 recycled; write=0.082 s, sync=0.016 s, total=0.104 s;
sync files=38, longest=0.004 s, average=0.000 s; distance=159 kB,
estimate=159 kB",,,,,,,,"LogCheckpointEnd, xlog.c:8451",""
,,0,LOG,00000,"database system is ready to accept
connections",,,,,,,,"reaper, postmaster.c:2866",""
```

Core file:

```
(gdb) bt
#0  0x00000000008fc1c3 in yy_transition ()
Cannot access memory at address 0x7fff048a5788
```

best regards,
digoal


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14668: BRIN open autosummarize=on , database will crash

From
Thomas Munro
Date:
On Wed, May 24, 2017 at 6:33 PM,  <digoal@126.com> wrote:
> postgres=# create table test(id serial8, c1 int, c2 int);
> CREATE TABLE
> postgres=# create index idx_test_1 on test using brin(id) with
> (pages_per_range=1,autosummarize=on);
> CREATE INDEX
>
> vi test.sql
> \set c1 random(1,10000)
> \set c2 random(1,1000000)
> insert into test (c1,c2) values (:c1, :c2);
>
> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 100
>
> then PostgreSQL crash,

Reproduced here.

   frame #3: 0x000000010ac2d6f0
postgres`ExceptionalCondition(conditionName="!(pointer !=
((void*)0))", errorType="FailedAssertion",
fileName="../../../../src/include/utils/memutils.h", lineNumber=116) +
128 at assert.c:54
   frame #4: 0x000000010ac6f856
postgres`GetMemoryChunkContext(pointer=0x0000000000000000) + 54 at
memutils.h:116
   frame #5: 0x000000010ac6f725
postgres`pfree(pointer=0x0000000000000000) + 21 at mcxt.c:952
   frame #6: 0x000000010a5cabd5
postgres`brin_free_tuple(tuple=0x0000000000000000) + 21 at
brin_tuple.c:310
   frame #7: 0x000000010a5c2b88
postgres`brininsert(idxRel=0x000000010b190638,
values=0x00007fff5563dbd0, nulls="", heaptid=0x00007fcb67801b8c,
heapRel=0x000000010b18b1d0, checkUnique=UNIQUE_CHECK_NO,
indexInfo=0x00007fcb67800aa0) + 680 at brin.c:193

I guess brin_free_tuple(lastPageTuple) should only be called if it's
not NULL, so I guess brin.c lacks an "else" here:

-                       brin_free_tuple(lastPageTuple);
+                       else
+                               brin_free_tuple(lastPageTuple);

It doesn't crash for me with that change.
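
For readers unfamiliar with the failure mode in the backtrace, here is a minimal standalone sketch of it. It is not PostgreSQL code: DemoTuple, demo_strict_free and demo_free_if_present are hypothetical names, and the only thing assumed from the trace is that the deallocator treats a NULL pointer as a programming error (unlike the C library's free(), which accepts NULL). It illustrates the NULL-guard idea only and says nothing about whether freeing is the right cleanup in brin.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tuple; not the real BrinTuple layout. */
typedef struct DemoTuple
{
    int placeholder;
} DemoTuple;

/* Counts successful releases, for demonstration only. */
int demo_frees = 0;

/*
 * Strict deallocator: like the pfree() frame in the backtrace, it treats
 * a NULL argument as a bug and trips an assertion.
 */
void
demo_strict_free(DemoTuple *tuple)
{
    assert(tuple != NULL);
    free(tuple);
    demo_frees++;
}

/*
 * Guarded caller: release only when there is something to release.
 * Returns 1 if the tuple was freed, 0 if the call was skipped.
 */
int
demo_free_if_present(DemoTuple *tuple)
{
    if (tuple == NULL)
        return 0;
    demo_strict_free(tuple);
    return 1;
}
```

Calling demo_free_if_present(NULL) returns 0 without reaching the assertion, whereas calling demo_strict_free(NULL) directly would abort in an assert-enabled build, which is the shape of frames #3 to #6 above.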

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [BUGS] BUG #14668: BRIN open autosummarize=on , database will crash

From
Alvaro Herrera
Date:
Thomas Munro wrote:

> I guess brin_free_tuple(lastPageTuple) should only be called if it's
> not NULL, so I guess brin.c lacks an "else" here:
> 
> -                       brin_free_tuple(lastPageTuple);
> +                       else
> +                               brin_free_tuple(lastPageTuple);
> 
> It doesn't crash for me with that change.

Pushed fix.  Actually that's not correct either, because tuples returned
by brinGetTupleForHeapBlock are not supposed to be freed at all since
they are shared buffer items.  The correct thing to do there was to
release the buffer lock ...
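
The distinction between an owned allocation and a pointer borrowed from a locked page can be sketched in plain C. This is a toy model, not the BRIN code: DemoPage, demo_get_item, demo_unlock_page and demo_read_item are hypothetical names, and the rule that the getter takes the lock only when it finds an item is an assumption of the sketch rather than a statement about brinGetTupleForHeapBlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared page guarded by a lock flag. */
typedef struct DemoPage
{
    bool locked;
    int  items[4];
    int  nitems;
} DemoPage;

/*
 * Returns a pointer INTO the page's own storage and leaves the page
 * locked, or returns NULL without locking when the item is absent
 * (assumption of this toy model).  The result is borrowed, not owned.
 */
int *
demo_get_item(DemoPage *page, int index)
{
    if (index < 0 || index >= page->nitems)
        return NULL;
    page->locked = true;
    return &page->items[index];
}

/* The matching cleanup for a borrowed pointer: drop the lock. */
void
demo_unlock_page(DemoPage *page)
{
    page->locked = false;
}

/*
 * Copies the item out if present.  The borrowed pointer is never passed
 * to a deallocator; the only cleanup is releasing the lock.
 */
bool
demo_read_item(DemoPage *page, int index, int *value_out)
{
    int *item = demo_get_item(page, index);

    if (item == NULL)
        return false;           /* nothing was locked, nothing to undo */
    *value_out = *item;
    demo_unlock_page(page);     /* not free(item): the page owns the memory */
    return true;
}
```

In this model a lookup that finds its item ends with the page unlocked and its contents intact, and a lookup that finds nothing has no cleanup to do at all.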

Thanks for the report and analysis.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [BUGS] BUG #14668: BRIN open autosummarize=on , database will crash

From
Thomas Munro
Date:
On Wed, May 31, 2017 at 10:19 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Pushed fix.  Actually that's not correct either, because tuples returned
> by brinGetTupleForHeapBlock are not supposed to be freed at all since
> they are shared buffer items.  The correct thing to do there was to
> release the buffer lock ...

Ugh, right, thanks.  I'll blame the myopic analysis on jetlag and lack
of coffee.

-- 
Thomas Munro
http://www.enterprisedb.com

