Crash and core on 10.1 and 10.2 - Mailing list pgsql-general

From Kelly Burkhart
Subject Crash and core on 10.1 and 10.2
Date
Msg-id CAND8VyANL0kqeOvfbQ=zp+wHDvnJ55QnmsebAcvbepHq-iZnHg@mail.gmail.com
Whole thread Raw
Responses Re: Crash and core on 10.1 and 10.2  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hello, I've had two core dumps in the last couple of weeks.  The most recent, yesterday was on version 10.2:

(gdb) bt
#0  0x00007f317a043886 in get_next_seq () from /lib64/libc.so.6
#1  0x00007f317a044acc in strcoll_l () from /lib64/libc.so.6
#2  0x00000000007ced5f in varstrfastcmp_locale ()
#3  0x000000000081b6fb in qsort_ssup ()
#4  0x000000000081d8e1 in tuplesort_performsort ()
#5  0x00000000005eaf00 in finalize_aggregates ()
#6  0x00000000005ebe42 in ExecAgg ()
#7  0x00000000005e30e8 in ExecProcNodeInstr ()
#8  0x00000000005ded12 in standard_ExecutorRun ()
#9  0x0000000000589380 in ExplainOnePlan ()
#10 0x0000000000589667 in ExplainOneQuery ()
#11 0x0000000000589b74 in ExplainQuery ()
#12 0x000000000070036b in standard_ProcessUtility ()
#13 0x00000000006fd9f7 in PortalRunUtility ()
#14 0x00000000006fe702 in FillPortalStore ()
#15 0x00000000006ff110 in PortalRun ()
#16 0x00000000006fb163 in exec_simple_query ()
#17 0x00000000006fc41c in PostgresMain ()
#18 0x0000000000475c7d in ServerLoop ()
#19 0x0000000000697449 in PostmasterMain ()
#20 0x0000000000476691 in main ()

The earlier was last week on 10.1:

(gdb) bt
#0  0x00007f6e1f09d8ea in get_next_seq () from /lib64/libc.so.6
#1  0x00007f6e1f09eacc in strcoll_l () from /lib64/libc.so.6
#2  0x00000000007cf70b in varstr_cmp ()
#3  0x000000000075f25b in compareJsonbContainers ()
#4  0x000000000075d8f2 in jsonb_eq ()
#5  0x00000000005db2bc in ExecInterpExpr ()
#6  0x00000000005e3b09 in ExecScan ()
#7  0x00000000005de352 in standard_ExecutorRun ()
#8  0x00000000005e262a in ParallelQueryMain ()
#9  0x00000000004db981 in ParallelWorkerMain ()
#10 0x00000000006895ff in StartBackgroundWorker ()
#11 0x000000000069470d in maybe_start_bgworkers ()
#12 0x0000000000695245 in sigusr1_handler ()
#13 <signal handler called>
#14 0x00007f6e1f0f9783 in __select_nocancel () from /lib64/libc.so.6
#15 0x0000000000475327 in ServerLoop ()
#16 0x0000000000696409 in PostmasterMain ()
#17 0x0000000000476671 in main ()


These are running on a centos 7 host, dell r640.  My modifications to the default postgresql.conf are:

max_connections = 1000
shared_buffers = 12GB
work_mem = 1GB
maintenance_work_mem = 1GB
constraint_exclusion = partition
track_activities = on
track_counts = on
track_functions = all
track_io_timing = on
autovacuum = on
datestyle = 'iso, ymd'
log_destination = 'csvlog'
logging_collector = on
log_directory = '/opt/pgdata/dblog'
log_filename = 'postgresql-%w.log'
log_file_mode = 0644
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_duration_statement = 0
log_checkpoints = on
log_connections = on
log_disconnections = on
log_duration = on
log_hostname = off
log_line_prefix = '#%r:%p:%m:%i# '
log_lock_Waits = on
log_statement = 'none'
log_temp_files = 0
update_process_title = on
temp_tablespaces = 'tb_temp'
idle_in_transaction_session_timeout = 300000     # in milliseconds, 0 is disabled


The database is quite busy at the time of the crash with 99% of statements being very simple.  Each time the crash happened during an ad-hoc query of a table with a jsonb field and a gin index on that field.  The most recent query:

server process (PID 318386) was terminated by signal 11: Segmentation fault  Failed process was running: explain analyze select count(distinct other_keys->>'shard_dt') from krb_dataset_shard where dataset_id=22 and other_keys->>'symbol' = 'AAPL';

The table looks like this:

                           Table "tb_us.krb_dataset_shard"
   Column   |  Type  |                           Modifiers                            
------------+--------+----------------------------------------------------------------
 id         | bigint | not null default nextval('krb_dataset_shard_id_seq'::regclass)
 dataset_id | bigint | not null
 other_keys | jsonb  | not null default '{}'::jsonb
Indexes:
    "krb_dataset_shard_pkey" PRIMARY KEY, btree (id), tablespace "tb_idx_tablespace"
    "krb_dataset_shard_ak1" btree (dataset_id), tablespace "tb_idx_tablespace"
    "krb_dataset_shard_ak2" gin (other_keys), tablespace "tb_idx_tablespace"
Foreign-key constraints:
    "dataset_id_fk" FOREIGN KEY (dataset_id) REFERENCES dataset(id)


I do not have the query from the previous crash.

Please let me know if there is any other information that I can provide.

Thanks,

-Kelly



pgsql-general by date:

Previous
From: Francisco Olarte
Date:
Subject: Re: Trigger to create string to inverse SQL statement
Next
From: Tom Lane
Date:
Subject: Re: Crash and core on 10.1 and 10.2