Re: Top -N Query performance issue and high CPU usage - Mailing list pgsql-general

From Ron Johnson
Subject Re: Top -N Query performance issue and high CPU usage
Msg-id CANzqJaAE6bevjj43W-94Bx1xGB9OPbUtOyXV2kaGpju1wt3k+A@mail.gmail.com
In response to Re: Top -N Query performance issue and high CPU usage  (yudhi s <learnerdatabase99@gmail.com>)
List pgsql-general
On Tue, Feb 3, 2026 at 4:26 AM yudhi s <learnerdatabase99@gmail.com> wrote:
On Tue, Feb 3, 2026 at 4:50 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Mon, Feb 2, 2026 at 3:43 PM yudhi s <learnerdatabase99@gmail.com> wrote:

On Tue, Feb 3, 2026 at 1:01 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Mon, Feb 2, 2026 at 1:39 PM yudhi s <learnerdatabase99@gmail.com> wrote:
On Mon, Feb 2, 2026 at 8:57 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:

My apologies if I misunderstand the plan, but as I read it, it's spending ~140 ms (147 ms - 6 ms), i.e. almost all of the time, in the nested loop join below. So my question was: is there any possibility of reducing the resource consumption or response time further here? I hope my understanding is correct.

-> Nested Loop (cost=266.53..1548099.38 rows=411215 width=20) (actual time=6.009..147.695 rows=1049 loops=1)
Join Filter: ((df.ent_id)::numeric = m.ent_id)
Rows Removed by Join Filter: 513436
Buffers: shared hit=1939

I don't see m.ent_id in the actual query.  Did you only paste a portion of the query?

Also, casting in a JOIN typically brutalizes the ability to use an index.


Thank you.
Actually, I tried executing only the first two CTEs, where the query was spending most of its time, and the alias has changed.

We need to see everything, not just what you think is relevant.
 
Also, I have changed the real table names before posting here; I hope that is fine.
However, I verified the data type of the ent_id column: in "ent" it is "int8" and in "txn_tbl" it is "numeric 12". So do you mean that this difference in data types is causing the high response time during the nested loop join? My understanding was that the values would be cast internally without additional burden. Also, I even tried creating an index on "(df.ent_id)::numeric", but it still results in the same plan and response time.

If you'd shown the "\d" table definitions like Adrian asked two days ago, we'd know what indexes are on each table, and not have to beg you to dispense dribs and drabs of information.


I am unable to run "\d" from the DBeaver SQL worksheet. However, I have fetched the DDL for the three tables and their selected columns used in the smaller version of the query, along with its plan, which I recently updated.
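For clients that lack psql meta-commands, roughly the same information that "\d" prints can be pulled from the catalogs with plain SQL. This is only a sketch: it assumes the tables were created with unquoted names (so the schema is stored as app_schema) and covers just the two tables named in this thread.

```sql
-- Column names and types, roughly the first section of "\d" output.
SELECT table_name, column_name, data_type,
       numeric_precision, numeric_scale, is_nullable
FROM information_schema.columns
WHERE table_schema = 'app_schema'   -- assumption: unquoted APP_schema folds to lower case
  AND table_name IN ('txn_tbl', 'ent')
ORDER BY table_name, ordinal_position;

-- Index definitions on the same tables.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'app_schema'
  AND tablename IN ('txn_tbl', 'ent')
ORDER BY tablename, indexname;
```

Pasting both result sets into the thread would show the column types on each side of the join and which indexes exist.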



Lines 30-32 are where most of the time and effort are taken.

I can't be certain, but changing APP_schema.ent.ent_id from NUMERIC to int8 (with a CHECK constraint to, well, constrain it to 12 digits, if really necessary) is something I'd test.
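A sketch of what such a test might look like, ideally on a copy of the table rather than in production. The constraint name ent_id_max_12_digits is made up for illustration, and the type change rewrites the table under an ACCESS EXCLUSIVE lock, so it is worth timing first.

```sql
BEGIN;

-- Convert the column; existing values are cast explicitly via USING.
ALTER TABLE APP_schema.ent
    ALTER COLUMN ent_id TYPE bigint USING ent_id::bigint;

-- Optional: keep the 12-digit limit that the NUMERIC declaration enforced.
-- (Hypothetical constraint name.)
ALTER TABLE APP_schema.ent
    ADD CONSTRAINT ent_id_max_12_digits
    CHECK (ent_id BETWEEN -999999999999 AND 999999999999);

COMMIT;

-- Refresh planner statistics for the changed column.
ANALYZE APP_schema.ent;
```

After that, rerunning the query under EXPLAIN (ANALYZE, BUFFERS) should show whether the "::numeric" cast has disappeared from the Join Filter and whether the plan changes.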




Hmm.  What does pg_stat_user_tables say about when you last analyzed and vacuumed APP_schema.txn_tbl and APP_schema.ent?
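Something along these lines would answer that; it is a sketch that assumes the schema name is stored in lower case, hence the lower() call.

```sql
-- When were the two tables last vacuumed and analyzed, manually or by autovacuum?
SELECT schemaname, relname,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze,
       n_live_tup, n_dead_tup, n_mod_since_analyze
FROM pg_stat_user_tables
WHERE lower(schemaname) = 'app_schema'
  AND relname IN ('txn_tbl', 'ent');
```

A large n_mod_since_analyze relative to n_live_tup, or NULL analyze timestamps, would suggest the planner is working from stale statistics.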

Beyond "aggressively keep those two tables analyzed, via reducing autovacuum_analyze_scale_factor to something like 0.05, and adding 'vacuumdb -d mumble -j2 --analyze-only -t APP_schema.txn_tbl -t APP_schema.ent' to crontab", I'm out of ideas.  An 85% speed improvement is nothing to sneeze at, though.
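One way to apply the scale-factor part is per table rather than globally. That choice is an assumption on my part, and 0.05 is simply the value suggested above, not a tuned number.

```sql
-- Trigger autoanalyze after roughly 5% of rows change, for these two tables only.
ALTER TABLE APP_schema.txn_tbl SET (autovacuum_analyze_scale_factor = 0.05);
ALTER TABLE APP_schema.ent     SET (autovacuum_analyze_scale_factor = 0.05);

-- Confirm that the storage parameters were recorded.
SELECT relname, reloptions
FROM pg_class
WHERE relname IN ('txn_tbl', 'ent');
```

Per-table settings leave autovacuum behavior for the rest of the database untouched, which keeps the experiment easy to revert with ALTER TABLE ... RESET (autovacuum_analyze_scale_factor).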
 

There is no VARCHAR or CHAR; there is only TEXT.  Thus, this is 100% expected and normal.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
