Database transaction with intermittent slow responses - Mailing list pgsql-performance

From John Gorman
Subject Database transaction with intermittent slow responses
Date
Msg-id 0A1B0B276DEE6441A2E68EB66D4540DAC34BCF20@winex2.eldocomp.com
Whole thread Raw
Responses Re: Database transaction with intermittent slow responses  (Gerardo Herzig <gherzig@fmed.uba.ar>)
List pgsql-performance
Transactions to table, ChangeHistory, have recently become intermittently slow and is increasing becoming slower.
 
* No database configuration changes have been made recently
* We have run vacuum analyze
* We have tried backing up and reloading the table (data, indexes, etc)
 
Some transactions respond quickly (200 ms) and others take over 55 seconds (we cancel the query after 55 seconds – our timeout SLA). The problem has recently become very bad. It is the same query being issued but with different parameters.
 
If the transaction takes over 55 seconds and I run the query manually (with or without EXPLAIN ANALYZE) it returns quickly (a few hundred ms). In case I am looking at cache, I have a list of other queries (just different parameters) that have timed out and when I run them (without the limit even) the response is very timely.
 
Any help or insight would be great.
 
NOTE: our application is connecting to the database via JDBC and we are using PreparedStatements. I have provided full details so all information is available, but please let me know if any other information is needed – thx in advance.
 
p306=> EXPLAIN ANALYZE SELECT * FROM ChangeHistory WHERE Category BETWEEN 'Employee' AND 'Employeezz' AND PrimaryKeyOfChange BETWEEN '312313' AND '312313!zz' ORDER BY ChgTS DESC, ChgUser DESC, Category DESC, PrimaryKeyOfChange DESC LIMIT 11;
                                            QUERY PLAN                                            
------------------------------------------------------------------------------------------------------
Limit  (cost=33.66..33.67 rows=1 width=136) (actual time=0.297..0.297 rows=11 loops=1)
   ->  Sort  (cost=33.66..33.67 rows=1 width=136) (actual time=0.297..0.297 rows=11 loops=1)
         Sort Key: chgts, chguser, category, primarykeyofchange
         Sort Method: top-N heapsort  Memory: 27kB
         ->  Index Scan using changehistory_idx4 on changehistory  (cost=0.00..33.65 rows=1 width=136) (actual time=0.046..
0.239 rows=85 loops=1)
               Index Cond: (((primarykeyofchange)::text >= '312313'::text) AND ((primarykeyofchange)::text <= '312313!zz'::
text))
               Filter: (((category)::text >= 'Employee'::text) AND ((category)::text <= 'Employeezz'::text))
Total runtime: 0.328 ms
(8 rows)
 
>>>
History this week of counts with good response times vs timeouts.
 
| Date       | Success # | Time Out # | Avg. Success Secs |
|------------+-----------+------------+-------------------|
| 2016-05-09 |        18 |         31 |               7.9 |
| 2016-05-10 |        17 |         25 |              10.5 |
| 2016-05-11 |        27 |         33 |              10.1 |
| 2016-05-12 |        68 |         24 |               9.9 |
 
 
>>> Sample transaction response times
 
| Timestamp         | Tran ID        | Resp MS | Resp CD
--------------------+----------------+---------+--------
2016-05-10 06:20:19 | ListChangeHist | 55,023  | TIMEOUT
2016-05-10 07:47:34 | ListChangeHist | 55,017  | TIMEOUT
2016-05-10 07:48:00 | ListChangeHist |  9,866  | OK
2016-05-10 07:48:10 | ListChangeHist |  2,327  | OK
2016-05-10 07:59:23 | ListChangeHist | 55,020  | TIMEOUT
2016-05-10 08:11:20 | ListChangeHist | 55,030  | TIMEOUT
2016-05-10 08:31:45 | ListChangeHist |  4,216  | OK
2016-05-10 08:35:09 | ListChangeHist |  7,898  | OK
2016-05-10 08:36:18 | ListChangeHist |  9,810  | OK
2016-05-10 08:36:56 | ListChangeHist | 55,027  | TIMEOUT
2016-05-10 08:37:33 | ListChangeHist | 46,433  | OK
2016-05-10 08:38:09 | ListChangeHist | 55,019  | TIMEOUT
2016-05-10 08:53:43 | ListChangeHist | 55,019  | TIMEOUT
2016-05-10 09:45:09 | ListChangeHist | 55,022  | TIMEOUT
2016-05-10 09:46:13 | ListChangeHist | 55,017  | TIMEOUT
2016-05-10 09:49:27 | ListChangeHist | 55,011  | TIMEOUT
2016-05-10 09:52:12 | ListChangeHist | 55,018  | TIMEOUT
2016-05-10 09:57:42 | ListChangeHist |  9,462  | OK
2016-05-10 10:05:21 | ListChangeHist | 55,016  | TIMEOUT
2016-05-10 10:05:29 | ListChangeHist |    136  | OK
2016-05-10 10:05:38 | ListChangeHist |  1,517  | OK
 
Artifacts
======================
 
$ >uname -a
SunOS ***** 5.10 Generic_150400-30 sun4v sparc sun4v
 
Memory    : 254G phys mem, 207G free mem.
Processors: 32 - CPU is mostly 80% free
 
>>>
p306=> select version();
                                                      version                                                     
---------------------------------------------------------------------------------------------------
PostgreSQL 9.1.14 on sparc-sun-solaris2.10, compiled by gcc (GCC) 3.4.3 (csl-sol210-3_4-branch+sol_rpath), 64-bit
 
>>>
p306=> \dt+ changehistory
                      List of relations
Schema |     Name      | Type  | Owner | Size  | Description
--------+---------------+-------+-------+-------+-------------
public | changehistory | table | p306  | 17 GB |
 
>>>
p306=> \di+ changehistory*
                                   List of relations
Schema |         Name          | Type  | Owner |     Table     |  Size   | Description
--------+-----------------------+-------+-------+---------------+---------+-------------
public | changehistory_idx1    | index | p306  | changehistory | 9597 MB |
public | changehistory_idx3    | index | p306  | changehistory | 11 GB   |
public | changehistory_idx4    | index | p306  | changehistory | 4973 MB |
public | changehistory_pkey    | index | p306  | changehistory | 2791 MB |
public | changehistory_search2 | index | p306  | changehistory | 9888 MB |
public | changehistory_search3 | index | p306  | changehistory | 10 GB   |
public | changehistory_search4 | index | p306  | changehistory | 9240 MB |
public | changehistory_search5 | index | p306  | changehistory | 8373 MB |
(8 rows)
 
 
>>>
p306=> select count(*) from changehistory ;
   count  
------------
129,185,024
 
>>>
Show all (filtered)
======================================================
 
              name               | setting
---------------------------------+--------------------
autovacuum                      | on
autovacuum_analyze_scale_factor | 0.001
autovacuum_analyze_threshold    | 500
autovacuum_freeze_max_age       | 200000000
autovacuum_max_workers          | 5
autovacuum_naptime              | 1min
autovacuum_vacuum_cost_delay    | 0
autovacuum_vacuum_cost_limit    | -1
autovacuum_vacuum_scale_factor  | 0.001
autovacuum_vacuum_threshold     | 500
bgwriter_delay                  | 200ms
block_size                      | 8192
check_function_bodies           | on
checkpoint_completion_target    | 0.9
checkpoint_segments             | 256
checkpoint_timeout              | 1h
checkpoint_warning              | 30s
client_encoding                 | UTF8
commit_delay                    | 0
commit_siblings                 | 5
cpu_index_tuple_cost            | 0.005
cpu_operator_cost               | 0.0025
cpu_tuple_cost                  | 0.01
cursor_tuple_fraction           | 0.1
deadlock_timeout                | 1s
default_statistics_target       | 100
default_transaction_deferrable  | off
default_transaction_isolation   | read committed
default_transaction_read_only   | off
default_with_oids               | off
effective_cache_size            | 8GB
from_collapse_limit             | 8
fsync                           | on
full_page_writes                | on
ignore_system_indexes           | off
join_collapse_limit             | 8
krb_caseins_users               | off
lo_compat_privileges            | off
maintenance_work_mem            | 1GB
max_connections                 | 350
max_files_per_process           | 1000
max_function_args               | 100
max_identifier_length           | 63
max_index_keys                  | 32
max_locks_per_transaction       | 64
max_pred_locks_per_transaction  | 64
max_prepared_transactions       | 0
max_stack_depth                 | 2MB
max_wal_senders                 | 5
random_page_cost                | 4
segment_size                    | 1GB
seq_page_cost                   | 1
server_encoding                 | UTF8
server_version                  | 9.1.14
shared_buffers                  | 2GB
sql_inheritance                 | on
statement_timeout               | 0
synchronize_seqscans            | on
synchronous_commit              | on
synchronous_standby_names       |
tcp_keepalives_count            | 0
tcp_keepalives_idle             | -1
tcp_keepalives_interval         | 0
track_activities                | on
track_activity_query_size       | 1024
track_counts                    | on
track_functions                 | none
transaction_deferrable          | off
transaction_isolation           | read committed
transaction_read_only           | off
transform_null_equals           | off
update_process_title            | on
vacuum_cost_delay               | 0
vacuum_cost_limit               | 200
vacuum_cost_page_dirty          | 20
vacuum_cost_page_hit            | 1
vacuum_cost_page_miss           | 10
vacuum_defer_cleanup_age        | 0
vacuum_freeze_min_age           | 50000000
vacuum_freeze_table_age         | 150000000
 
John Gorman | Manager of Production Support, Architecture, Release Engineering | Eldorado | a Division of MPHASIS | www.eldoinc.com/ |
5353 North 16th Street, Suite 400, Phoenix, Arizona 85016-3228 | Tel 602.604.3100 | Fax: 602.604.3115
 
 
 

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: LIKE pattern
Next
From: Gerardo Herzig
Date:
Subject: Re: Database transaction with intermittent slow responses