Re: Query performance with min and filter - Mailing list pgsql-general

From Sengottuvelusamy, Karthikeyan
Subject Re: Query performance with min and filter
Date
Msg-id VI1PR03MB4397EB36020E64882F655AA889510@VI1PR03MB4397.eurprd03.prod.outlook.com
In response to Query performance with min and filter  ("Yorwerth, Adam" <Adam.Yorwerth@tesco.com>)
List pgsql-general
Hi Adam,

What are the query times when you run the query directly at the psql prompt, without EXPLAIN/ANALYZE?

Can you compare the cache hit rate vs. disk reads between the EXPLAIN ANALYZE run and the calls from Java? Sometimes the data is in RAM when you run a query manually, but the live Java app may be hitting disk, causing the slowdown. Also, are you sure that what you're measuring for the query when it's called from the Java app is the server-side execution time, and not network delays etc.?
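
To check both things, something like the following might help (a sketch: the second query assumes the pg_stat_statements extension is installed, and uses the column names of PostgreSQL 12 and earlier; on 13+ they are mean_exec_time / max_exec_time):

```sql
-- One-off: show shared-buffer hits vs. disk reads for a single execution
EXPLAIN (ANALYZE, BUFFERS)
SELECT coalesce(
    (SELECT min(msg_offset) - 1 FROM events
      WHERE created_utc >= CURRENT_TIMESTAMP - INTERVAL '60 SECONDS'),
    (SELECT max(msg_offset) FROM events),
    0
);

-- Server-side timings for the statement as the app actually runs it
-- (requires pg_stat_statements; column names are for PG <= 12)
SELECT calls, mean_time, max_time,
       shared_blks_hit, shared_blks_read
  FROM pg_stat_statements
 WHERE query LIKE '%min(msg_offset)%';
```

If pg_stat_statements reports a low mean_time while the app still sees ~2 seconds, the time is being spent outside the server (network, connection pool, driver).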

Regards,
Karthik.
--
HoSD, In-Store Platform
Ph: +91 99406 59452


From: Yorwerth, Adam <Adam.Yorwerth@tesco.com>
Sent: Monday, August 31, 2020 2:57 PM
To: pgsql-general@lists.postgresql.org <pgsql-general@lists.postgresql.org>
Cc: Cianciaruso, Lorenzo <lorenzo.cianciaruso@tesco.com>
Subject: Query performance with min and filter
 

Hi Postgres community,

 

We seem to have found a situation where a query run using explain analyse or pgbench is incredibly fast, but when run via Java under load it performs very poorly – we’ve checked query performance metrics for our Postgres instance and can confirm that it’s the query running slowly, and not a Java problem.

 

The table has ~7 million rows, with a total size of 44GB. As shown below, explain analyse gives a sub-1ms execution time, yet in practice the query takes an average of 2 seconds – it’s worth mentioning that a small percentage of requests do take ~1ms, so something non-deterministic is happening here.

 

The query in question is below, with explain analyse:

 

explain analyse SELECT coalesce (
    (SELECT min(msg_offset) - 1 FROM events WHERE created_utc >= CURRENT_TIMESTAMP - INTERVAL '60 SECONDS' ),
    (SELECT max(msg_offset) FROM events),
    0
);

                                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=2.03..2.04 rows=1 width=8) (actual time=0.183..0.183 rows=1 loops=1)
   InitPlan 1 (returns $1)
     ->  Aggregate  (cost=1.56..1.57 rows=1 width=8) (actual time=0.028..0.028 rows=1 loops=1)
           ->  Index Only Scan using test_4 on events  (cost=0.44..1.56 rows=1 width=8) (actual time=0.023..0.023 rows=0 loops=1)
                 Index Cond: (created_utc >= (now() - '00:01:00'::interval))
                 Heap Fetches: 0
   InitPlan 3 (returns $3)
     ->  Result  (cost=0.45..0.46 rows=1 width=8) (actual time=0.153..0.153 rows=1 loops=1)
           InitPlan 2 (returns $2)
             ->  Limit  (cost=0.43..0.45 rows=1 width=8) (actual time=0.152..0.152 rows=1 loops=1)
                   ->  Index Only Scan Backward using events_pkey on events events_1  (cost=0.43..156842.57 rows=7643099 width=8) (actual time=0.151..0.151 rows=1 loops=1)
                         Index Cond: (msg_offset IS NOT NULL)
                         Heap Fetches: 0
 Planning time: 2.480 ms
 Execution time: 0.242 ms
(15 rows)

 

Output of \d events, with a few test indexes added to try and improve performance:

 

                                             Table "public.events"
    Column    |            Type             | Collation | Nullable |                  Default
--------------+-----------------------------+-----------+----------+--------------------------------------------
 msg_offset   | bigint                      |           | not null | nextval('events_msg_offset_seq'::regclass)
 msg_key      | character varying           |           | not null |
 content_type | character varying           |           | not null |
 type         | character varying           |           | not null |
 created_utc  | timestamp without time zone |           | not null |
 data         | text                        |           |          |
 event_size   | integer                     |           | not null | 1048576
 cluster_id   | bigint                      |           | not null | 1
Indexes:
    "events_pkey" PRIMARY KEY, btree (msg_offset)
    "cluster_idx" btree (msg_key, cluster_id)
    "cluster_type_filter_idx" btree (type, cluster_id)
    "created_idx" btree (created_utc, msg_key)
    "key_idx" btree (msg_key)
    "test_1" btree (created_utc, msg_offset)
    "test_2" btree (msg_offset, created_utc)
    "test_3" btree (created_utc, msg_offset)
    "test_4" btree (created_utc, msg_offset DESC)
    "type_idx" btree (type)

 

Any help understanding what is happening here, or guidance to improve performance, would be much appreciated.

 

Regards,

 

Adam

 

This is a confidential email. Tesco may monitor and record all emails. The views expressed in this email are those of the sender and not Tesco. Tesco Stores Limited Company Number: 519500 Registered in England Registered Office: Tesco House, Shire Park, Kestrel Way, Welwyn Garden City, AL7 1GA VAT Registration Number: GB 220 4302 31
