Re: [BUGS] BUG #14758: Segfault with logical replication on afunction index - Mailing list pgsql-bugs
From | Masahiko Sawada |
---|---|
Subject | Re: [BUGS] BUG #14758: Segfault with logical replication on afunction index |
Date | |
Msg-id | CAD21AoCHutPoUdtZg2RHDneXEFnbL3zp+FcQkMrZ8Lrfa3kZLQ@mail.gmail.com Whole thread Raw |
In response to | [BUGS] BUG #14758: Segfault with logical replication on a function index (scott@deltaex.com) |
Responses |
Re: [BUGS] BUG #14758: Segfault with logical replication on afunction index
Re: [BUGS] BUG #14758: Segfault with logical replication on afunction index |
List | pgsql-bugs |
On Mon, Jul 24, 2017 at 4:22 PM, <scott@deltaex.com> wrote: > The following bug has been logged on the website: > > Bug reference: 14758 > Logged by: Scott Milliken > Email address: scott@deltaex.com > PostgreSQL version: 10beta2 > Operating system: Linux 4.10.0-27-generic #30~16.04.2-Ubuntu S > Description: > > I'm testing logical replication on 10beta2, and found a segfault that I can > reliably reproduce with an index on a not-actually immutable function. > > Here's the function in question: > > ``` > CREATE OR REPLACE FUNCTION public.immutable_random(integer) > RETURNS double precision > LANGUAGE sql > IMMUTABLE > AS $function$SELECT random(); > $function$; > ``` > > It's not actually immutable since it's calling random (a hack to get an > efficient random sort on a table). > > (Aside: I'd understand if it errored on creation of the index, but would > really prefer to keep using this instead of tablesample because it's fast, > deterministic, and doesn't have sampling biases like the SYSTEM sampling.) > > > Here's full reproduction instructions: > > > Primary: > ``` > mkdir -p /tmp/test-seg0 > PGPORT=5301 initdb -D /tmp/test-seg0 > echo "wal_level = logical" >> /tmp/test-seg0/postgresql.conf > PGPORT=5301 pg_ctl -D /tmp/test-seg0 start > for (( ; ; )); do if pg_isready -d postgres -p 5301; then break; fi; sleep > 1; done > psql -p 5301 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER > CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;" > createdb -p 5301 -E utf8 test > > psql -p 5301 -U test test -c "CREATE TABLE testtbl (id int, name text);" > psql -p 5301 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT > testtbl_pkey PRIMARY KEY (id);" > psql -p 5301 -U test test -c "CREATE PUBLICATION testpub FOR TABLE > testtbl;" > psql -p 5301 -U test test -c "INSERT INTO testtbl (id, name) VALUES (1, > 'a');" > ``` > > Secondary: > ``` > mkdir -p /tmp/test-seg1 > PGPORT=5302 initdb -D /tmp/test-seg1 > PGPORT=5302 pg_ctl -D /tmp/test-seg1 start > for (( ; ; )); do if pg_isready -d postgres -p 5302; then break; fi; sleep > 1; done > psql -p 5302 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER > CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;" > createdb -p 5302 -E utf8 test > > psql -p 5302 -U test test -c "CREATE TABLE testtbl (id int, name text);" > psql -p 5302 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT > testtbl_pkey PRIMARY KEY (id);" > psql -p 5302 -U test test -c 'CREATE FUNCTION > public.immutable_random(integer) RETURNS double precision LANGUAGE sql > IMMUTABLE AS $function$ SELECT random(); $function$' > psql -p 5302 -U test test -c "CREATE INDEX ix_testtbl_random ON testtbl > USING btree (immutable_random(id));" > psql -p 5302 -U test test -c "CREATE SUBSCRIPTION test0_testpub CONNECTION > 'port=5301 user=test dbname=test' PUBLICATION testpub;" > ``` > > The secondary crashes with a segfault: > > ``` > 2017-07-23 23:55:37.961 PDT [4823] LOG: logical replication table > synchronization worker for subscription "test0_testpub", table "testtbl" > has started > 2017-07-23 23:55:38.244 PDT [4758] LOG: worker process: logical replication > worker for subscription 16396 sync 16386 (PID 4823) was terminated by signal > 11: Segmentation fault > 2017-07-23 23:55:38.244 PDT [4758] LOG: terminating any other active server > processes > 2017-07-23 23:55:38.245 PDT [4763] WARNING: terminating connection because > of crash of another server process > 2017-07-23 23:55:38.245 PDT [4763] DETAIL: The postmaster has commanded > this server process to roll back the current transaction and exit, because > another server process exited > abnormally and possibly corrupted shared memory. > 2017-07-23 23:55:38.245 PDT [4763] HINT: In a moment you should be able to > reconnect to the database and repeat your command. > 2017-07-23 23:55:38.247 PDT [4758] LOG: all server processes terminated; > reinitializing > 2017-07-23 23:55:38.256 PDT [4826] LOG: database system was interrupted; > last known up at 2017-07-23 23:55:36 PDT > 2017-07-23 23:55:38.809 PDT [4826] LOG: database system was not properly > shut down; automatic recovery in progress > 2017-07-23 23:55:38.812 PDT [4826] LOG: redo starts at 0/173AEA0 > 2017-07-23 23:55:38.815 PDT [4826] LOG: invalid record length at 0/17B50B0: > wanted 24, got 0 > 2017-07-23 23:55:38.815 PDT [4826] LOG: redo done at 0/17B5070 > 2017-07-23 23:55:38.815 PDT [4826] LOG: last completed transaction was at > log time 2017-07-23 23:55:37.962957-07 > ``` > Thank you for the reporting and precise reproducing steps! I could reproduced this issue and it seems to me that the cause of this is that the table sync worker didn't get a snapshot before starting table copy. Attached patch fixes this problem. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Attachment
pgsql-bugs by date: