Thread: database crash during pgbench run

database crash during pgbench run

From
Mariel Cherkassky
Date:
Hi,
I have a very strong machine with 64GB of ram and 19 cpu but it seems that when I'm running the next benchmark test with pg_bench the database is crashing : 

createdb -U postgres bench
pgbench -i -s 50 -U postgres -d bench
pgbench -U postgres -d bench -c 10 -t 10000

output : 

client 8 receiving
FATAL:  terminating connection due to administrator command
client 8 sending UPDATE pgbench_accounts SET abalance = abalance + -1542 WHERE aid = 1142155;
client 8 could not send UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
invalid socket: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 50
query mode: simple
number of clients: 10
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 89241/100000
latency average = 27.944 ms
tps = 357.864437 (including connections establishing)
tps = 357.871594 (excluding connections establishing)


it crashes after some time and not immediately.

output from logs :

2018-12-10 19:11:12 IST  505  LOG:  automatic analyze of table "bench.public.pgbench_branches" system usage: CPU 0.00s/0.00u sec elapsed 0.01 sec
2018-12-10 19:11:12 IST  505  LOG:  automatic analyze of table "bench.public.pgbench_history" system usage: CPU 0.00s/0.05u sec elapsed 0.10 sec
2018-12-10 19:11:14 IST bench 25045  LOG:  duration: 1451.819 ms  statement: UPDATE pgbench_branches SET bbalance = bbalance + -4059 WHERE bid = 14;
2018-12-10 19:11:40 IST bench 25049  LOG:  duration: 1039.710 ms  statement: UPDATE pgbench_tellers SET tbalance = tbalance + 3596 WHERE tid = 403;
2018-12-10 19:11:56 IST  23647  LOG:  received fast shutdown request
2018-12-10 19:11:56 IST  23647  LOG:  aborting any active transactions
2018-12-10 19:11:56 IST bench 25051  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST bench 25049  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST sadas 27765  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST bench 25050  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST  23654  LOG:  autovacuum launcher shutting down
2018-12-10 19:11:56 IST  sadas   24821  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST hadr 24814  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST bench 25047  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST bench 25048  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST hadr 24065  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST postgres 24818  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST postgres 24819  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST hadr 24812  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST postgres 24817  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST bench 25046  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST hadr 24813  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST postgres 24816  FATAL:  terminating connection due to administrator command
2018-12-10 19:11:56 IST  23651  LOG:  shutting down
2018-12-10 19:11:56 IST  23651  LOG:  checkpoint starting: shutdown immediate
2018-12-10 19:11:59 IST  23651  LOG:  checkpoint complete: wrote 69557 buffers (4.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=2.800 s, sync=0.045 s, total=2.877 s; sync files=23, longest=0.045 s, average=0.001 s; distance=573364 kB, estimate=573364 kB
2018-12-10 19:11:59 IST  23647  LOG:  database system is shut down
2018-12-10 19:12:11 IST  2641  LOG:  database system was shut down at 2018-12-10 19:11:59 IST
2018-12-10 19:12:11 IST  2641  LOG:  MultiXact member wraparound protections are now enabled
2018-12-10 19:12:11 IST  2638  LOG:  database system is ready to accept connections
2018-12-10 19:12:11 IST  2645  LOG:  autovacuum launcher started
2018-12-10 19:12:17 IST  2692  LOG:  automatic vacuum of table "bench.public.pgbench_tellers": index scans: 0
        pages: 0 removed, 13 remain, 0 skipped due to pins, 0 skipped frozen

some conf parameters : 
listen_addresses = '*'
maintenance_work_mem = 128MB
work_mem = 53MB
shared_buffers = 13411MB
effective_cache_size = 32278MB
max_wal_size = 1440MB
wal_buffers = 16MB
checkpoint_completion_target = 0.9
standard_conforming_strings = off
max_locks_per_transaction = 5000
max_connections = 1200
checkpoint_timeout = 30min
random_page_cost = 2.0


any idea what can cause it ?

Re: database crash during pgbench run

From
Tom Lane
Date:
Mariel Cherkassky <mariel.cherkassky@gmail.com> writes:
> 2018-12-10 19:11:56 IST  23647  LOG:  received fast shutdown request

> any idea what can cause it ?

Something sent SIGINT to the postmaster.

            regards, tom lane


RE: database crash during pgbench run

From
Greg Clough
Date:
> > 2018-12-10 19:11:56 IST  23647  LOG:  received fast shutdown request
> > any idea what can cause it ?
>
>Something sent SIGINT to the postmaster.

My money is on the OoM (Out of Memory) killer.  The standard PDGD install on CentOS should disable that, but I'm not
surewhat OS you're on or how PostgreSQL was installed. 

Greg Clough.








________________________________

This e-mail, including accompanying communications and attachments, is strictly confidential and only for the intended
recipient.Any retention, use or disclosure not expressly authorised by IHSMarkit is prohibited. This email is subject
toall waivers and other terms at the following link: https://ihsmarkit.com/Legal/EmailDisclaimer.html 

Please visit www.ihsmarkit.com/about/contact-us.html for contact information on our offices worldwide.


Re: database crash during pgbench run

From
Tom Lane
Date:
Greg Clough <Greg.Clough@ihsmarkit.com> writes:
>>> 2018-12-10 19:11:56 IST  23647  LOG:  received fast shutdown request
>>> any idea what can cause it ?

>> Something sent SIGINT to the postmaster.

> My money is on the OoM (Out of Memory) killer.

That usually uses SIGKILL.  If I had to guess, I'd wonder whether the
postmaster was manually started, and if so whether it was properly
dissociated from the user's terminal (with nohup or the like).
If it wasn't, then a control-C typed at the terminal would SIGINT the
postmaster as well as whatever it was meant to terminate.

            regards, tom lane


Re: database crash during pgbench run

From
Merlin Moncure
Date:
On Tue, Dec 11, 2018 at 10:01 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Greg Clough <Greg.Clough@ihsmarkit.com> writes:
> >>> 2018-12-10 19:11:56 IST  23647  LOG:  received fast shutdown request
> >>> any idea what can cause it ?
>
> >> Something sent SIGINT to the postmaster.
>
> > My money is on the OoM (Out of Memory) killer.
>
> That usually uses SIGKILL.  If I had to guess, I'd wonder whether the
> postmaster was manually started, and if so whether it was properly
> dissociated from the user's terminal (with nohup or the like).
> If it wasn't, then a control-C typed at the terminal would SIGINT the
> postmaster as well as whatever it was meant to terminate.

Yeah.  To add to this, pgbench runs are extremely unlikely to cause
the kind of memory consumption issues that would trigger an OOM.  This
is definitely not a database crash, just some kind of administrative
problem.  Some things that might be helpful to help figure this out:
*) What o/s
*) how was the database installed
*) how exactly did the database start
*) are we looking at something exotic here (cloud managed postgres,
exotic storage, etc)

merlin