Re: Top -N Query performance issue and high CPU usage - Mailing list pgsql-general
| From | yudhi s |
|---|---|
| Subject | Re: Top -N Query performance issue and high CPU usage |
| Date | |
| Msg-id | CAEzWdqd2ELrUo1xnrFaB9vCicjTuHk+KEbXEaiyX0jp1f3M+tQ@mail.gmail.com |
| In response to | Re: Top -N Query performance issue and high CPU usage (felix.quintgz@yahoo.com) |
| List | pgsql-general |
On Wed, Feb 4, 2026 at 9:18 PM <felix.quintgz@yahoo.com> wrote:
Have you tried adding an index to txn_tbl.txn_type?
And a vacuum on all tables? It seems the visibility map is outdated.
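For instance (a sketch only; the index name is made up, and the table/column names are taken from the plan later in this thread — adjust to the actual schema):

```sql
-- Index on the filter column suggested above.
CREATE INDEX CONCURRENTLY IF NOT EXISTS txn_tbl_type_nm_idx
    ON txn_tbl (txn_tbl_type_nm);

-- Refresh the visibility map and statistics on the tables involved,
-- so index-only scans work and the planner's estimates stay accurate.
VACUUM (ANALYZE) txn_tbl, txn_status, txn_decision, txn_sale, ath;
```

CONCURRENTLY avoids blocking writes while the index builds, which matters on a live transaction table.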
I'm using https://explain.dalibo.com to view the plan visually; it's more convenient.
You could periodically save the results of queries with common filters to another table, and then serve user queries (with their own additional filters) from that table.
You should also cache each user's query results somewhere for a while, to avoid hitting the database repeatedly.
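One way to sketch that "pre-computed table" idea is a materialized view refreshed on a schedule (the view name, column list, and uniqueness assumption are illustrative, not from this thread):

```sql
-- Pre-compute the expensive shared part once; per-user filters then
-- run against this small table instead of the big joins.
CREATE MATERIALIZED VIEW latest_txns AS
SELECT df.ent_id, df.tran_date, df.txn_tbl_type_nm
FROM   txn_tbl df
WHERE  df.txn_tbl_type_nm IN ('TYPE1', 'TYPE2', 'TYPE3')
ORDER  BY df.tran_date DESC
LIMIT  1000;

-- A unique index is required for REFRESH ... CONCURRENTLY
-- (assumes (ent_id, tran_date) is unique in this result set).
CREATE UNIQUE INDEX ON latest_txns (ent_id, tran_date);

-- Run on a schedule (e.g. via cron or pg_cron); CONCURRENTLY keeps
-- the view readable while it is being refreshed.
REFRESH MATERIALIZED VIEW CONCURRENTLY latest_txns;
```

Thousands of users then read the same 1000 pre-computed rows instead of each re-running the multi-table join.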
I imagine this is some kind of dashboard that each user lands on after authenticating. It looks nice in presentations, but after a while in production it can make the system unusable. I had to remove similar charts from the homepage of a system because, after a year of operation, they were taking a minute to load.
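On the question of which part of the query is burning the CPU, the usual tools are per-node timings from EXPLAIN and the pg_stat_statements extension (a generic sketch, not specific to this thread):

```sql
-- Per-node timing and buffer counts for a single execution:
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- substitute the actual query

-- Aggregate view across many executions; requires pg_stat_statements
-- in shared_preload_libraries, then:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT query, calls, total_exec_time, mean_exec_time, shared_blks_hit
FROM   pg_stat_statements
ORDER  BY total_exec_time DESC
LIMIT  10;
```

The nodes with the largest `actual time × loops` in the EXPLAIN output are where the CPU is going; in the plan below that is the Materialize node executed 43K times.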
On Saturday, January 31, 2026 at 08:30:33 AM GMT-5, yudhi s <learnerdatabase99@gmail.com> wrote:
Hello Experts,
We have a SELECT query that joins three to five main transaction tables (txn_tbl, txn_status, txn_decision, txn_sale, ath), each holding ~2 million rows (expected to grow to ~50-100 million in future), plus 6-7 other tables, some of them master tables and the rest small lookup tables.
A single run of this query takes ~2-3 seconds. However, when we hit it from 10-15 sessions at the same time, CPU usage on the DB instance spikes to ~50-60%, and it reaches ~90% when we increase the concurrency further to 40-50 sessions.
This query will be called on the first page of a UI screen and is supposed to show the latest 1000 rows based on a certain transaction date. Thousands of users may land on that first page and fire this same query at the same time.
It's Postgres version 17. The instance has 2 vCPUs and 16GB RAM.
I have the following questions.
1) Why is this query causing a high CPU spike? Is there any way in Postgres to understand which part/line of the query is contributing most to the CPU time?
2) How can we tune this query to further reduce its response time and, mainly, its CPU consumption? Would an additional index or anything else improve this plan further?
3) Is there any guidance or best practice for designing top-N queries for such UI scenarios where performance is an important factor?
4) Based on the CPU cores and memory, is there any calculation by which we can say this machine supports at most N concurrent queries of this type, beyond which we need a machine with more CPU cores?
Below is the query and its current plan: https://gist.github.com/databasetech0073/6688701431dc4bf4eaab8d345c1dc65f
Regards
Yudhi
As folks suggested, adding an index on "tran_date", combining the CTEs into two, and making the data types match for "ent_id" have reduced the response time to a large extent. Now I am trying to see if we can reduce it any further, as most of the time (~100-20 = ~80ms) is now spent in the Materialize loop, which executes 43K times.
I am also wondering whether adding the "txn_tbl_type_nm" column to the index, i.e. a composite index on (tran_date, txn_tbl_type_nm), would be advisable for cases where ~500K rows are removed by the txn_tbl_type_nm filter criterion (currently only 17 rows are filtered out in this case).
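The composite index I have in mind would look roughly like this (the index name is made up, and whether the DESC ordering and column order actually help depends on the query's ORDER BY and how selective the type filter is):

```sql
-- Composite index so the backward scan on tran_date can check
-- txn_tbl_type_nm inside the index, instead of applying a row-level
-- Filter as in the current plan.
CREATE INDEX CONCURRENTLY txn_tbl_tran_date_type_idx
    ON txn_tbl (tran_date DESC, txn_tbl_type_nm);
```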
| -> Nested Loop (cost=263.20..1680202.56 rows=483106 width=20) (actual time=6.421..111.220 rows=1000 loops=1) |
| Buffers: shared hit=6168 |
| -> Nested Loop (cost=262.77..1342550.91 rows=579149 width=20) (actual time=6.406..107.946 rows=1049 loops=1) |
| Join Filter: (df.ent_id = m.ent_id) |
| Rows Removed by Join Filter: 514648 |
| Buffers: shared hit=1972 |
| -> Index Scan Backward using txn_tbl_due_dt_idx on txn_tbl df (cost=0.43..115879.87 rows=1419195 width=20) (actual time=0.019..20.377 rows=43727 loops=1) |
| Filter: ((txn_tbl_type_nm)::text = ANY ('{TYPE1,TYPE2,TYPE3}'::text[])) |
| Rows Removed by Filter: 17 |
| Buffers: shared hit=1839 |
| -> Materialize (cost=262.35..364.01 rows=58 width=8) (actual time=0.000..0.001 rows=12 loops=43727) |
| Buffers: shared hit=133 |
Regards
Yudhi