Thread: Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
Hi Team,One of the queries, which retrieves a single record from a table with 16 hash partitions, is taking more than 10 seconds to execute. In contrast, when we run the same query manually, it completes within milliseconds. This issue is causing exhaustion of the application pools. Do we have any bugs in postgrs16 hash partitions? Please find the attached log, table, and execution plan.
size of the each partitions : 300GB
Index Size : 12GBPostgres Version : 16.x
Shared Buffers : 75 GB
Effective_cache : 175 GB
Work _mem : 4MB
Max_connections : 3000OS : Ubuntu 22.04
Ram : 384 GB
CPU : 64Please let us know if you need any further information or if there are additional details required.
Regards,Ram.
It might be worth eliminating the use of cached plans here. Is your app using prepared statements at all?Point is that if the optimizer sees the same prepared query , 5 times, the it locks the plan that it found at that time. This is a good trade off as it avoids costly planning-time for repetitive queries. But if you are manually querying, the a custom plan will be generated anew.A quick analyze of the table should reset the stats and invalidate any cached plans.This may not be your problem just worth eliminating it from the list of potential causes.On Wed, 6 Nov 2024, 17:14 Ramakrishna m, <ram.pgdb@gmail.com> wrote:Hi Team,One of the queries, which retrieves a single record from a table with 16 hash partitions, is taking more than 10 seconds to execute. In contrast, when we run the same query manually, it completes within milliseconds. This issue is causing exhaustion of the application pools. Do we have any bugs in postgrs16 hash partitions? Please find the attached log, table, and execution plan.
size of the each partitions : 300GB
Index Size : 12GBPostgres Version : 16.x
Shared Buffers : 75 GB
Effective_cache : 175 GB
Work _mem : 4MB
Max_connections : 3000OS : Ubuntu 22.04
Ram : 384 GB
CPU : 64Please let us know if you need any further information or if there are additional details required.
Regards,Ram.
Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
Hi ,Thanks for the suggestions.Two more observations:1) no sequence scan noticed from pg_stat_user_tables ( hope stats are accurate in postgres 16) if parameter sniffing happens the possibility of going to sequence scan is more right.2) no blockings or IO issue during the time.3) even with limit clause if touch all partitions also it could have been completed in milliseconds as this is just one record.4) auto_explain in prod we cannot enable as this is expensive and with high TPS we may face latency issues and lower environment this issue cannot be reproduced,( this is happening out of Million one case)This looks puzzle to us, just in case anyone experianced pls share your experience.Regards,RaviOn Thu, 7 Nov, 2024, 3:41 am David Mullineux, <dmullx@gmail.com> wrote:It might be worth eliminating the use of cached plans here. Is your app using prepared statements at all?Point is that if the optimizer sees the same prepared query , 5 times, the it locks the plan that it found at that time. This is a good trade off as it avoids costly planning-time for repetitive queries. But if you are manually querying, the a custom plan will be generated anew.A quick analyze of the table should reset the stats and invalidate any cached plans.This may not be your problem just worth eliminating it from the list of potential causes.On Wed, 6 Nov 2024, 17:14 Ramakrishna m, <ram.pgdb@gmail.com> wrote:Hi Team,One of the queries, which retrieves a single record from a table with 16 hash partitions, is taking more than 10 seconds to execute. In contrast, when we run the same query manually, it completes within milliseconds. This issue is causing exhaustion of the application pools. Do we have any bugs in postgrs16 hash partitions? Please find the attached log, table, and execution plan.
size of the each partitions : 300GB
Index Size : 12GBPostgres Version : 16.x
Shared Buffers : 75 GB
Effective_cache : 175 GB
Work _mem : 4MB
Max_connections : 3000OS : Ubuntu 22.04
Ram : 384 GB
CPU : 64Please let us know if you need any further information or if there are additional details required.
Regards,Ram.
Just spotted a potential problem. The indexed column is a bigint. Are you, in your prepared statement passing a string or a big int ?I notice your plan is doing an implicit type conversion when you run it manually.Sometimes the wrong type will make it not use the index.On Fri, 8 Nov 2024, 03:07 ravi k, <ravisql09@gmail.com> wrote:Hi ,Thanks for the suggestions.Two more observations:1) no sequence scan noticed from pg_stat_user_tables ( hope stats are accurate in postgres 16) if parameter sniffing happens the possibility of going to sequence scan is more right.2) no blockings or IO issue during the time.3) even with limit clause if touch all partitions also it could have been completed in milliseconds as this is just one record.4) auto_explain in prod we cannot enable as this is expensive and with high TPS we may face latency issues and lower environment this issue cannot be reproduced,( this is happening out of Million one case)This looks puzzle to us, just in case anyone experianced pls share your experience.Regards,RaviOn Thu, 7 Nov, 2024, 3:41 am David Mullineux, <dmullx@gmail.com> wrote:It might be worth eliminating the use of cached plans here. Is your app using prepared statements at all?Point is that if the optimizer sees the same prepared query , 5 times, the it locks the plan that it found at that time. This is a good trade off as it avoids costly planning-time for repetitive queries. But if you are manually querying, the a custom plan will be generated anew.A quick analyze of the table should reset the stats and invalidate any cached plans.This may not be your problem just worth eliminating it from the list of potential causes.On Wed, 6 Nov 2024, 17:14 Ramakrishna m, <ram.pgdb@gmail.com> wrote:Hi Team,One of the queries, which retrieves a single record from a table with 16 hash partitions, is taking more than 10 seconds to execute. In contrast, when we run the same query manually, it completes within milliseconds. This issue is causing exhaustion of the application pools. Do we have any bugs in postgrs16 hash partitions? Please find the attached log, table, and execution plan.
size of the each partitions : 300GB
Index Size : 12GBPostgres Version : 16.x
Shared Buffers : 75 GB
Effective_cache : 175 GB
Work _mem : 4MB
Max_connections : 3000OS : Ubuntu 22.04
Ram : 384 GB
CPU : 64Please let us know if you need any further information or if there are additional details required.
Regards,Ram.
Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
One hypothesis is that, a bad plan got cached for that SQL pattern. Obviously, when you run it manually you are always getting a custom plan as it's not a prepared statement.
Sorry, it was typo. Bind variable is bigint only.ThanksOn Fri, 8 Nov, 2024, 7:09 pm David Mullineux, <dmullx@gmail.com> wrote:Just spotted a potential problem. The indexed column is a bigint. Are you, in your prepared statement passing a string or a big int ?I notice your plan is doing an implicit type conversion when you run it manually.Sometimes the wrong type will make it not use the index.On Fri, 8 Nov 2024, 03:07 ravi k, <ravisql09@gmail.com> wrote:Hi ,Thanks for the suggestions.Two more observations:1) no sequence scan noticed from pg_stat_user_tables ( hope stats are accurate in postgres 16) if parameter sniffing happens the possibility of going to sequence scan is more right.2) no blockings or IO issue during the time.3) even with limit clause if touch all partitions also it could have been completed in milliseconds as this is just one record.4) auto_explain in prod we cannot enable as this is expensive and with high TPS we may face latency issues and lower environment this issue cannot be reproduced,( this is happening out of Million one case)This looks puzzle to us, just in case anyone experianced pls share your experience.Regards,RaviOn Thu, 7 Nov, 2024, 3:41 am David Mullineux, <dmullx@gmail.com> wrote:It might be worth eliminating the use of cached plans here. Is your app using prepared statements at all?Point is that if the optimizer sees the same prepared query , 5 times, the it locks the plan that it found at that time. This is a good trade off as it avoids costly planning-time for repetitive queries. But if you are manually querying, the a custom plan will be generated anew.A quick analyze of the table should reset the stats and invalidate any cached plans.This may not be your problem just worth eliminating it from the list of potential causes.On Wed, 6 Nov 2024, 17:14 Ramakrishna m, <ram.pgdb@gmail.com> wrote:Hi Team,One of the queries, which retrieves a single record from a table with 16 hash partitions, is taking more than 10 seconds to execute. In contrast, when we run the same query manually, it completes within milliseconds. This issue is causing exhaustion of the application pools. Do we have any bugs in postgrs16 hash partitions? Please find the attached log, table, and execution plan.
size of the each partitions : 300GB
Index Size : 12GBPostgres Version : 16.x
Shared Buffers : 75 GB
Effective_cache : 175 GB
Work _mem : 4MB
Max_connections : 3000OS : Ubuntu 22.04
Ram : 384 GB
CPU : 64Please let us know if you need any further information or if there are additional details required.
Regards,Ram.
Thanks for correction. At this point I would be trying to modifyplan_cache_modefor the session which uses the bond variable. alter it so that plan_cache_mode=force_custom_plan
One hypothesis is that, a bad plan got cached for that SQL pattern. Obviously, when you run it manually you are always getting a custom plan as it's not a prepared statement.On Sat, 9 Nov 2024, 03:46 ravi k, <ravisql09@gmail.com> wrote:Sorry, it was typo. Bind variable is bigint only.ThanksOn Fri, 8 Nov, 2024, 7:09 pm David Mullineux, <dmullx@gmail.com> wrote:Just spotted a potential problem. The indexed column is a bigint. Are you, in your prepared statement passing a string or a big int ?I notice your plan is doing an implicit type conversion when you run it manually.Sometimes the wrong type will make it not use the index.On Fri, 8 Nov 2024, 03:07 ravi k, <ravisql09@gmail.com> wrote:Hi ,Thanks for the suggestions.Two more observations:1) no sequence scan noticed from pg_stat_user_tables ( hope stats are accurate in postgres 16) if parameter sniffing happens the possibility of going to sequence scan is more right.2) no blockings or IO issue during the time.3) even with limit clause if touch all partitions also it could have been completed in milliseconds as this is just one record.4) auto_explain in prod we cannot enable as this is expensive and with high TPS we may face latency issues and lower environment this issue cannot be reproduced,( this is happening out of Million one case)This looks puzzle to us, just in case anyone experianced pls share your experience.Regards,RaviOn Thu, 7 Nov, 2024, 3:41 am David Mullineux, <dmullx@gmail.com> wrote:It might be worth eliminating the use of cached plans here. Is your app using prepared statements at all?Point is that if the optimizer sees the same prepared query , 5 times, the it locks the plan that it found at that time. This is a good trade off as it avoids costly planning-time for repetitive queries. But if you are manually querying, the a custom plan will be generated anew.A quick analyze of the table should reset the stats and invalidate any cached plans.This may not be your problem just worth eliminating it from the list of potential causes.On Wed, 6 Nov 2024, 17:14 Ramakrishna m, <ram.pgdb@gmail.com> wrote:Hi Team,One of the queries, which retrieves a single record from a table with 16 hash partitions, is taking more than 10 seconds to execute. In contrast, when we run the same query manually, it completes within milliseconds. This issue is causing exhaustion of the application pools. Do we have any bugs in postgrs16 hash partitions? Please find the attached log, table, and execution plan.
size of the each partitions : 300GB
Index Size : 12GBPostgres Version : 16.x
Shared Buffers : 75 GB
Effective_cache : 175 GB
Work _mem : 4MB
Max_connections : 3000OS : Ubuntu 22.04
Ram : 384 GB
CPU : 64Please let us know if you need any further information or if there are additional details required.
Regards,Ram.