Thread: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
Dear All,
we have hit a serious performance regression going from 10.23 → 16.4 as far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I recreated the good fast “old” (pgsql 10) setup on the same vm as the slow “new” (pgsql 16). Here is the bug report on github :
On 11/8/24 20:32, Achilleas Mantzios wrote: > Dear All, > > we have hit a serious performance regression going from 10.23 → 16.4 as > far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I > recreated the good fast “old” (pgsql 10) setup on the same vm as the > slow “new” (pgsql 16). Here is the bug report on github : > > https://github.com/tds-fdw/tds_fdw/issues/371 > > All environment on the two pgsql clusters is shared (freetds version, > tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on > the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . | > Setting client_min_messages TO debug3 does yield identical output on the > two systems. > | > > The new pgsql 16.4 shows 100% CPU usage during the query execution. > > I know we are pretty much alone with this, but it would be very nice if > anyone could help, see smth we are missing or guide us via the right path. > I have little experience with tds_fdw, and can't investigate that locally. But it might be interesting to compare CPU profiles for the two (slow and fast) cases. Chances are the difference will be an indication regarding what got that slower. It might be something in PG or in the FDW, hard to say. regards -- Tomas Vondra
Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
Στις 9/11/24 12:49, ο/η Tomas Vondra έγραψε: > On 11/8/24 20:32, Achilleas Mantzios wrote: >> Dear All, >> >> we have hit a serious performance regression going from 10.23 → 16.4 as >> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I >> recreated the good fast “old” (pgsql 10) setup on the same vm as the >> slow “new” (pgsql 16). Here is the bug report on github : >> >> https://github.com/tds-fdw/tds_fdw/issues/371 >> >> All environment on the two pgsql clusters is shared (freetds version, >> tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on >> the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . | >> Setting client_min_messages TO debug3 does yield identical output on the >> two systems. >> | >> >> The new pgsql 16.4 shows 100% CPU usage during the query execution. >> >> I know we are pretty much alone with this, but it would be very nice if >> anyone could help, see smth we are missing or guide us via the right path. >> > I have little experience with tds_fdw, and can't investigate that > locally. But it might be interesting to compare CPU profiles for the two > (slow and fast) cases. Chances are the difference will be an indication > regarding what got that slower. It might be something in PG or in the > FDW, hard to say. Thank you Tomas, do you have linux-perf in mind? How could we get the CPU profiles of the two ? Just in case it was not clear, the tests were done on the same VM, at the same time, having the two DBs listening on different ports. Kinda spooky from my part to do that on the production system, but I wanted everything to be the same (latency to/from the ms sql, etc) except the pgsql version, to prove my assumption, and it seems pgsql version is what makes the difference (in conjunction of course with tds_fdw ) . > > regards >
Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
Στις 9/11/24 16:05, ο/η Achilleas Mantzios έγραψε: > > Στις 9/11/24 12:49, ο/η Tomas Vondra έγραψε: >> On 11/8/24 20:32, Achilleas Mantzios wrote: >>> Dear All, >>> >>> we have hit a serious performance regression going from 10.23 → 16.4 as >>> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story >>> short, I >>> recreated the good fast “old” (pgsql 10) setup on the same vm as the >>> slow “new” (pgsql 16). Here is the bug report on github : >>> >>> https://github.com/tds-fdw/tds_fdw/issues/371 >>> >>> All environment on the two pgsql clusters is shared (freetds version, >>> tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The >>> speed on >>> the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . | >>> Setting client_min_messages TO debug3 does yield identical output on >>> the >>> two systems. >>> | >>> >>> The new pgsql 16.4 shows 100% CPU usage during the query execution. >>> >>> I know we are pretty much alone with this, but it would be very nice if >>> anyone could help, see smth we are missing or guide us via the right >>> path. >>> >> I have little experience with tds_fdw, and can't investigate that >> locally. But it might be interesting to compare CPU profiles for the two >> (slow and fast) cases. Chances are the difference will be an indication >> regarding what got that slower. It might be something in PG or in the >> FDW, hard to say. > Thank you Tomas, do you have linux-perf in mind? How could we get the > CPU profiles of the two ? Just in case it was not clear, the tests > were done on the same VM, at the same time, having the two DBs > listening on different ports. Kinda spooky from my part to do that on > the production system, but I wanted everything to be the same (latency > to/from the ms sql, etc) except the pgsql version, to prove my > assumption, and it seems pgsql version is what makes the difference > (in conjunction of course with tds_fdw ) . Meaning there are CPP directives in the tds_fdw source making it behave differently according to the Postgresql version. The two tds_fdw's have the same version. >> >> regards >> > >
On 11/9/24 15:05, Achilleas Mantzios wrote: > > Στις 9/11/24 12:49, ο/η Tomas Vondra έγραψε: >> On 11/8/24 20:32, Achilleas Mantzios wrote: >>> Dear All, >>> >>> we have hit a serious performance regression going from 10.23 → 16.4 as >>> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I >>> recreated the good fast “old” (pgsql 10) setup on the same vm as the >>> slow “new” (pgsql 16). Here is the bug report on github : >>> >>> https://github.com/tds-fdw/tds_fdw/issues/371 >>> >>> All environment on the two pgsql clusters is shared (freetds version, >>> tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on >>> the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . | >>> Setting client_min_messages TO debug3 does yield identical output on the >>> two systems. >>> | >>> >>> The new pgsql 16.4 shows 100% CPU usage during the query execution. >>> >>> I know we are pretty much alone with this, but it would be very nice if >>> anyone could help, see smth we are missing or guide us via the right >>> path. >>> >> I have little experience with tds_fdw, and can't investigate that >> locally. But it might be interesting to compare CPU profiles for the two >> (slow and fast) cases. Chances are the difference will be an indication >> regarding what got that slower. It might be something in PG or in the >> FDW, hard to say. > Thank you Tomas, do you have linux-perf in mind? How could we get the > CPU profiles of the two ? Just in case it was not clear, the tests were > done on the same VM, at the same time, having the two DBs listening on > different ports. Kinda spooky from my part to do that on the production > system, but I wanted everything to be the same (latency to/from the ms > sql, etc) except the pgsql version, to prove my assumption, and it seems > pgsql version is what makes the difference (in conjunction of course > with tds_fdw ) . >> Yes, I mean linux-perf. There's a wikipage with some basic instructions: https://wiki.postgresql.org/wiki/Profiling_with_perf But in short, I'd do this: 1) use pg_backend_pid() to get PID of the backend SELECT pg_backend_pid(); 2) start recording profile for that PID, say for 1 minute perf record -g -p $PID -- sleep 60 3) run the query in a loop SELECT count(*) as foo FROM mssql_bdynacom."ACDOC" \watch 1 4) once the recording stops, generate report perf report > report.txt Do this for both versions, share the output txt files. You may need to install additional packages with debug symbols to get better profiles, and stuff like that. Also, don't forget to share the explain plans. regard -- Tomas Vondra
Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
Στις 9/11/24 17:41, ο/η Tomas Vondra έγραψε: > > On 11/9/24 15:05, Achilleas Mantzios wrote: >> Στις 9/11/24 12:49, ο/η Tomas Vondra έγραψε: >>> On 11/8/24 20:32, Achilleas Mantzios wrote: >>>> Dear All, >>>> >>>> we have hit a serious performance regression going from 10.23 → 16.4 as >>>> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I >>>> recreated the good fast “old” (pgsql 10) setup on the same vm as the >>>> slow “new” (pgsql 16). Here is the bug report on github : >>>> >>>> https://github.com/tds-fdw/tds_fdw/issues/371 >>>> >>>> All environment on the two pgsql clusters is shared (freetds version, >>>> tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on >>>> the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . | >>>> Setting client_min_messages TO debug3 does yield identical output on the >>>> two systems. >>>> | >>>> >>>> The new pgsql 16.4 shows 100% CPU usage during the query execution. >>>> >>>> I know we are pretty much alone with this, but it would be very nice if >>>> anyone could help, see smth we are missing or guide us via the right >>>> path. >>>> >>> I have little experience with tds_fdw, and can't investigate that >>> locally. But it might be interesting to compare CPU profiles for the two >>> (slow and fast) cases. Chances are the difference will be an indication >>> regarding what got that slower. It might be something in PG or in the >>> FDW, hard to say. >> Thank you Tomas, do you have linux-perf in mind? How could we get the >> CPU profiles of the two ? Just in case it was not clear, the tests were >> done on the same VM, at the same time, having the two DBs listening on >> different ports. Kinda spooky from my part to do that on the production >> system, but I wanted everything to be the same (latency to/from the ms >> sql, etc) except the pgsql version, to prove my assumption, and it seems >> pgsql version is what makes the difference (in conjunction of course >> with tds_fdw ) . > Yes, I mean linux-perf. There's a wikipage with some basic instructions: > > https://wiki.postgresql.org/wiki/Profiling_with_perf > > But in short, I'd do this: > > 1) use pg_backend_pid() to get PID of the backend > > SELECT pg_backend_pid(); > > 2) start recording profile for that PID, say for 1 minute > > perf record -g -p $PID -- sleep 60 > > 3) run the query in a loop > > SELECT count(*) as foo FROM mssql_bdynacom."ACDOC" \watch 1 > > 4) once the recording stops, generate report > > perf report > report.txt > > > Do this for both versions, share the output txt files. You may need to > install additional packages with debug symbols to get better profiles, > and stuff like that. > > Also, don't forget to share the explain plans. Thank you Tomas, as soon as I get my hands on the new VM I will do as you suggested and report back. > > regard >
On 11/12/24 11:37, Achilleas Mantzios - cloud wrote: > > ... > > We run perf on both systems for 90 seconds to make sure that it included > at least two runs of the slow system, and we attach both outputs. We run > perf as : > > sudo perf record -g -p <PID> -- sleep 90 > > it strikes me that we dont find similarities between the two, but we are > new to perf. > Not sure, but it seems the slow profile has a lot of writes. Not sure why. Do both instances have the same work_mem value / available memory, storatge? Maybe EXPLAIN VERBOSE would show if the remote query is the same ... Maybe try setting log_temp_files=0 before running the query. Writing the foreign scan result set into a temp file could be happening on one of the machines only. regards -- Tomas Vondra
Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
In fact the plain vanilla installation, with all the defaults, runs this fast, please read further.On 11/12/24 11:37, Achilleas Mantzios - cloud wrote:... We run perf on both systems for 90 seconds to make sure that it included at least two runs of the slow system, and we attach both outputs. We run perf as : sudo perf record -g -p <PID> -- sleep 90 it strikes me that we dont find similarities between the two, but we are new to perf.Not sure, but it seems the slow profile has a lot of writes. Not sure why. Do both instances have the same work_mem value / available memory, storatge?
Maybe EXPLAIN VERBOSE would show if the remote query is the same ... Maybe try setting log_temp_files=0 before running the query. Writing the foreign scan result set into a temp file could be happening on one of the machines only.
No temp files, the remote query shown by EXPLAIN (... VERBOSE) and ms sql it self is :
Remote query: SELECT NULL FROM [db_ro_non_delosnav].[ACDOC]
Aggregate (cost=168359742.33..168359742.34 rows=1 width=8) (actual time=15100.248..15100.250 rows=1 loops=1)
Output: count(*)
-> Foreign Scan on mssql_bdynacom."ACDOC" (cost=200.00..168355533.86 rows=1683386 width=100) (actual time=3.783..14967.029 rows=1683386 loops=1)
Output: "ID", "DOC_TYPE", "PARAST", "TRNS_DATE", "VIA_MNG", "VIA_MNG_ID", "FL_UPD", "NOTES", "REM_DATE", "REM_CLS", "OPN_CLS", "ACSITE_ID", "USERS_ID", "LAST_UPD_DA
TE", "LAST_UPD_USERS_ID", "OTH_TYPE", "OTH_TYPE_DESC", "OTH_APPL", "FL_PRN", "CREATE_DATE", "LOG_NOTE", "PER_ID", "ACDEPT_ID", "FL_DLT", "CMP_ID", "INTPER_ID", "FL_INT_TRNCL
S", "FL_ACTV", "OLD_ACDEPT_ID", "OLD_USERS_ID", "OLD_TRNS_DATE", "OLD_FL_UPD_DATE", "FL_FNLZ", "ERROR_NOTES", "FL_ATTCH", "OPDOC_ID", "FL_ONHOLD", "ONHOLD_NOTES", "FL_RVS",
"RVS_DOC_ID", "FL_REV", "REV_NO", "REV_DOC_ID", "REV_NOTES", "REV_DATE", "REV_USER_ID", "FL_PROT", "INTGR_DOC_ID", "INTGR_DOC_TP", "P_DOC_ID", "FL_OPN", "FL_MDL", "USERUPDAC
C_ID", "DATE_UPDACC", "USERIN_ID", "USERUPD_ID", "DATE_IN", "DATE_UPD", "DOC_SUBTYPE", "DOC_REF", "FRTRN_TYPE", "FL_ACC", "ATCH_DIR", "ATCH_REM", "KEYID", "OTH_APPL1", "RGST
R_ID", "ACDOC_LIST_NO", "EXEC_DATE", "EXEC_USER_ID", "EXEC_NOTES", "CNL_EXEC_DATE", "CNL_EXEC_USER_ID", "FL_EXEC", "CNL_EXEC_NOTES", "LOG_NOTE1", "USERREUPDACC_ID", "DATE_RE
UPDACC", "P_ACDOC_ID", "FL_AUTHORITY_APRV", "FL_APPROVE_TYPE", "FL_NOTINCLUDE_INAUTH", "FL_NO_OTHERCURRENCY", "FL_WFSTATUS", "FL_APRV", "USRACTV_LOG_ID", "USERIN_DNAME", "US
ERUPD_DNAME", "UROLEIN_ID", "UROLEIN_NAME", "UDEPTIN_ID", "UDEPTIN_NAME", "FL_LOAD_5805", "UROLEUPD_ID", "UROLEUPD_NAME", "CNTRY_ID", fl_togetback, "FULLPATH", "FL_UPD_OLD",
"FL_REOINTG", "FL_OINTGR_RESEND"
Remote query: SELECT NULL FROM [db_ro_non_delosnav].[ACDOC]
Query Identifier: 6812542821581303630
Planning Time: 14029.724 ms
JIT:
Functions: 2
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 0.168 ms, Inlining 2.087 ms, Optimization 3.301 ms, Emission 3.564 ms, Total 9.121 ms
Execution Time: 15102.803 ms
(12 rows)
I gather this is how it computes the plan.
I think we have narrowed down the problem, and this is extremely strange :
It is not a matter of pgsql version 10 or 16, it is a matter of the postgres user, the user that owns the data dir(s) and the user of the postgres process. We reproduced both the problem and the solution with all combinations of versions.
To sum it up :
Our sysadm created the system debian Debian GNU/Linux 12 (bookworm) with postgres as user 1000. Now at some point we realized that whenever we run a pgsql cluster with another user (I found that after spending two good days testing), the above query runs in about 1 second. With user postgres 1000 in 30 seconds. As you saw the perf output are completely different.
On the test VM, we removed the postgres user, recreated with uid=1003, chown -R all the mount points + table spaces, started postgres with the new postgres user , and bingo . The chown -R on the 5TB is instant... just saying ...
We are puzzled what can be causing this. Tomorrow we dig into GDB , *trace and the like.
If this rings any bells we would be more than grateful to know.
Also, the worse, is the suspicion that maybe our whole infra performance is affected. We hope it is only free-tds and tds_fdw .
regards
Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes: > Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with > postgres as user 1000. Now at some point we realized that whenever we > run a pgsql cluster with another user (I found that after spending two > good days testing), the above query runs in about 1 second. With user > postgres 1000 in 30 seconds. As you saw the perf output are completely > different. Don't recall details offhand, but in some situations where the calling SQL user doesn't have permissions to read particular columns, the planner will not consult statistics for those columns. That can lead to a different, less optimal plan being used. Maybe something like that is happening here? regards, tom lane
On 11/12/24 20:37, Tom Lane wrote: > Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes: >> Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with >> postgres as user 1000. Now at some point we realized that whenever we >> run a pgsql cluster with another user (I found that after spending two >> good days testing), the above query runs in about 1 second. With user >> postgres 1000 in 30 seconds. As you saw the perf output are completely >> different. > > Don't recall details offhand, but in some situations where the calling > SQL user doesn't have permissions to read particular columns, the > planner will not consult statistics for those columns. That can lead > to a different, less optimal plan being used. Maybe something like > that is happening here? > I don't know, the query is pretty trivial, and the estimates seemed exactly the same in both cases. And it shouldn't affect how the query gets planned on the MSSQL side. But this seems really strange: Planning Time: 14029.724 ms ... Execution Time: 15102.803 ms It's not about the execution, it's about the planning. I have no idea why should the planning take this long, except maybe for waiting for a lock, or something like that. But that's not really consistent with the profile ... it's weird. I'm not familiar with tds_fdw, but I see there are a bunch of table options [1] that might affect this, namely: * use_remote_estimate * local_tuple_estimate * row_estimate_method (defaults to 'execute') Are you sure these are set to the same value on both machines? Wild random guesses: 1) Could you try running the query with jit=off? 2) Did you run ANALYZE on the foreign table? Could matter when not using remote estimates (use_remote_estimate=false). 3) Could it be some sort of memory pressure/swapping? But that would look different in the profile, AFAIK. regards [1] https://github.com/tds-fdw/tds_fdw/blob/master/ForeignTableCreation.md -- Tomas Vondra
Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
On 11/12/24 21:37, Tom Lane wrote: > Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes: >> Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with >> postgres as user 1000. Now at some point we realized that whenever we >> run a pgsql cluster with another user (I found that after spending two >> good days testing), the above query runs in about 1 second. With user >> postgres 1000 in 30 seconds. As you saw the perf output are completely >> different. > Don't recall details offhand, but in some situations where the calling > SQL user doesn't have permissions to read particular columns, the > planner will not consult statistics for those columns. That can lead > to a different, less optimal plan being used. Maybe something like > that is happening here? Thank you Tom, In all runs of this query the user is postgres (superuser). Besides, this is just a simple query run on a FOREIGN TABLE living in a MS SQL Server. > > regards, tom lane
Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4
Dear All false alert, I run strace and it was obvious the slow one was producing huge debug output, while the fast one did not. It was not even a tds_fdw issue. It was freetds. Turned out we have forgotten enabled debugging inside the freetds configuration. You will ask me we did I get this effect of debugging only when run with user "postgres" ? Because /tmp/freetds.log belonged to postgres! I changed owner to nobody:nogroup and dont even need to restart postgres. !! I am sorry for all the noise. Thank you so much for your prompts !! On 11/12/24 23:46, Tomas Vondra wrote: > > On 11/12/24 20:37, Tom Lane wrote: >> Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes: >>> Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with >>> postgres as user 1000. Now at some point we realized that whenever we >>> run a pgsql cluster with another user (I found that after spending two >>> good days testing), the above query runs in about 1 second. With user >>> postgres 1000 in 30 seconds. As you saw the perf output are completely >>> different. >> Don't recall details offhand, but in some situations where the calling >> SQL user doesn't have permissions to read particular columns, the >> planner will not consult statistics for those columns. That can lead >> to a different, less optimal plan being used. Maybe something like >> that is happening here? >> > I don't know, the query is pretty trivial, and the estimates seemed > exactly the same in both cases. And it shouldn't affect how the query > gets planned on the MSSQL side. > > But this seems really strange: > > Planning Time: 14029.724 ms > ... > Execution Time: 15102.803 ms > > It's not about the execution, it's about the planning. I have no idea > why should the planning take this long, except maybe for waiting for a > lock, or something like that. But that's not really consistent with the > profile ... it's weird. > > I'm not familiar with tds_fdw, but I see there are a bunch of table > options [1] that might affect this, namely: > > * use_remote_estimate > * local_tuple_estimate > * row_estimate_method (defaults to 'execute') > > Are you sure these are set to the same value on both machines? > > Wild random guesses: > > 1) Could you try running the query with jit=off? > > 2) Did you run ANALYZE on the foreign table? Could matter when not using > remote estimates (use_remote_estimate=false). > > 3) Could it be some sort of memory pressure/swapping? But that would > look different in the profile, AFAIK. > > > regards > > > [1] https://github.com/tds-fdw/tds_fdw/blob/master/ForeignTableCreation.md >
https://github.com/tds-fdw/tds_fdw/issues/371
All environment on the two pgsql clusters is shared (freetds version, tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 .
Setting client_min_messages TO debug3 does yield identical output on the two systems.
The new pgsql 16.4 shows 100% CPU usage during the query execution.
I know we are pretty much alone with this, but it would be very nice if anyone could help, see smth we are missing or guide us via the right path.