Thread: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4

tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4

From
Achilleas Mantzios
Date:

Dear All, 

we have hit a serious performance regression going from 10.23 → 16.4 as far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I recreated the good fast “old” (pgsql 10) setup on the same vm as the slow “new” (pgsql 16). Here is the bug report on github :

https://github.com/tds-fdw/tds_fdw/issues/371

All environment on the two pgsql clusters is shared (freetds version, tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . Setting client_min_messages TO debug3 does yield identical output on the two systems.

The new pgsql 16.4 shows 100% CPU usage during the query execution.

I know we are pretty much alone with this, but it would be very nice if anyone could help, see smth we are missing or guide us via the right path.

On 11/8/24 20:32, Achilleas Mantzios wrote:
> Dear All, 
> 
> we have hit a serious performance regression going from 10.23 → 16.4 as
> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I
> recreated the good fast “old” (pgsql 10) setup on the same vm as the
> slow “new” (pgsql 16). Here is the bug report on github :
> 
> https://github.com/tds-fdw/tds_fdw/issues/371
> 
> All environment on the two pgsql clusters is shared (freetds version,
> tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on
> the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . |
> Setting client_min_messages TO debug3 does yield identical output on the
> two systems.
> |
> 
> The new pgsql 16.4 shows 100% CPU usage during the query execution.
> 
> I know we are pretty much alone with this, but it would be very nice if
> anyone could help, see smth we are missing or guide us via the right path.
> 

I have little experience with tds_fdw, and can't investigate that
locally. But it might be interesting to compare CPU profiles for the two
(slow and fast) cases. Chances are the difference will be an indication
regarding what got that slower. It might be something in PG or in the
FDW, hard to say.

regards

-- 
Tomas Vondra




Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4

From
Achilleas Mantzios
Date:
Στις 9/11/24 12:49, ο/η Tomas Vondra έγραψε:
> On 11/8/24 20:32, Achilleas Mantzios wrote:
>> Dear All,
>>
>> we have hit a serious performance regression going from 10.23 → 16.4 as
>> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I
>> recreated the good fast “old” (pgsql 10) setup on the same vm as the
>> slow “new” (pgsql 16). Here is the bug report on github :
>>
>> https://github.com/tds-fdw/tds_fdw/issues/371
>>
>> All environment on the two pgsql clusters is shared (freetds version,
>> tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on
>> the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . |
>> Setting client_min_messages TO debug3 does yield identical output on the
>> two systems.
>> |
>>
>> The new pgsql 16.4 shows 100% CPU usage during the query execution.
>>
>> I know we are pretty much alone with this, but it would be very nice if
>> anyone could help, see smth we are missing or guide us via the right path.
>>
> I have little experience with tds_fdw, and can't investigate that
> locally. But it might be interesting to compare CPU profiles for the two
> (slow and fast) cases. Chances are the difference will be an indication
> regarding what got that slower. It might be something in PG or in the
> FDW, hard to say.
Thank you Tomas, do you have linux-perf in mind? How could we get the 
CPU profiles of the two ?  Just in case it was not clear, the tests were 
done on the same VM, at the same time, having the two DBs listening on 
different ports. Kinda spooky from my part to do that on the production 
system, but I wanted everything to be the same (latency to/from the ms 
sql, etc) except the pgsql version, to prove my assumption, and it seems 
pgsql version is what makes the difference (in conjunction of course 
with tds_fdw ) .
>
> regards
>



Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4

From
Achilleas Mantzios
Date:
Στις 9/11/24 16:05, ο/η Achilleas Mantzios έγραψε:
>
> Στις 9/11/24 12:49, ο/η Tomas Vondra έγραψε:
>> On 11/8/24 20:32, Achilleas Mantzios wrote:
>>> Dear All,
>>>
>>> we have hit a serious performance regression going from 10.23 → 16.4 as
>>> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story 
>>> short, I
>>> recreated the good fast “old” (pgsql 10) setup on the same vm as the
>>> slow “new” (pgsql 16). Here is the bug report on github :
>>>
>>> https://github.com/tds-fdw/tds_fdw/issues/371
>>>
>>> All environment on the two pgsql clusters is shared (freetds version,
>>> tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The 
>>> speed on
>>> the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . |
>>> Setting client_min_messages TO debug3 does yield identical output on 
>>> the
>>> two systems.
>>> |
>>>
>>> The new pgsql 16.4 shows 100% CPU usage during the query execution.
>>>
>>> I know we are pretty much alone with this, but it would be very nice if
>>> anyone could help, see smth we are missing or guide us via the right 
>>> path.
>>>
>> I have little experience with tds_fdw, and can't investigate that
>> locally. But it might be interesting to compare CPU profiles for the two
>> (slow and fast) cases. Chances are the difference will be an indication
>> regarding what got that slower. It might be something in PG or in the
>> FDW, hard to say.
> Thank you Tomas, do you have linux-perf in mind? How could we get the 
> CPU profiles of the two ?  Just in case it was not clear, the tests 
> were done on the same VM, at the same time, having the two DBs 
> listening on different ports. Kinda spooky from my part to do that on 
> the production system, but I wanted everything to be the same (latency 
> to/from the ms sql, etc) except the pgsql version, to prove my 
> assumption, and it seems pgsql version is what makes the difference 
> (in conjunction of course with tds_fdw ) .
Meaning there are CPP directives in the tds_fdw source making it behave 
differently according to the Postgresql version. The two tds_fdw's have 
the same version.
>>
>> regards
>>
>
>




On 11/9/24 15:05, Achilleas Mantzios wrote:
> 
> Στις 9/11/24 12:49, ο/η Tomas Vondra έγραψε:
>> On 11/8/24 20:32, Achilleas Mantzios wrote:
>>> Dear All,
>>>
>>> we have hit a serious performance regression going from 10.23 → 16.4 as
>>> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I
>>> recreated the good fast “old” (pgsql 10) setup on the same vm as the
>>> slow “new” (pgsql 16). Here is the bug report on github :
>>>
>>> https://github.com/tds-fdw/tds_fdw/issues/371
>>>
>>> All environment on the two pgsql clusters is shared (freetds version,
>>> tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on
>>> the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . |
>>> Setting client_min_messages TO debug3 does yield identical output on the
>>> two systems.
>>> |
>>>
>>> The new pgsql 16.4 shows 100% CPU usage during the query execution.
>>>
>>> I know we are pretty much alone with this, but it would be very nice if
>>> anyone could help, see smth we are missing or guide us via the right
>>> path.
>>>
>> I have little experience with tds_fdw, and can't investigate that
>> locally. But it might be interesting to compare CPU profiles for the two
>> (slow and fast) cases. Chances are the difference will be an indication
>> regarding what got that slower. It might be something in PG or in the
>> FDW, hard to say.
> Thank you Tomas, do you have linux-perf in mind? How could we get the
> CPU profiles of the two ?  Just in case it was not clear, the tests were
> done on the same VM, at the same time, having the two DBs listening on
> different ports. Kinda spooky from my part to do that on the production
> system, but I wanted everything to be the same (latency to/from the ms
> sql, etc) except the pgsql version, to prove my assumption, and it seems
> pgsql version is what makes the difference (in conjunction of course
> with tds_fdw ) .
>>

Yes, I mean linux-perf. There's a wikipage with some basic instructions:

  https://wiki.postgresql.org/wiki/Profiling_with_perf

But in short, I'd do this:

1) use pg_backend_pid() to get PID of the backend

    SELECT pg_backend_pid();

2) start recording profile for that PID, say for 1 minute

    perf record -g -p $PID -- sleep 60

3) run the query in a loop

    SELECT count(*) as foo FROM mssql_bdynacom."ACDOC" \watch 1

4) once the recording stops, generate report

    perf report > report.txt


Do this for both versions, share the output txt files. You may need to
install additional packages with debug symbols to get better profiles,
and stuff like that.

Also, don't forget to share the explain plans.

regard

-- 
Tomas Vondra




Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4

From
Achilleas Mantzios
Date:
Στις 9/11/24 17:41, ο/η Tomas Vondra έγραψε:
>
> On 11/9/24 15:05, Achilleas Mantzios wrote:
>> Στις 9/11/24 12:49, ο/η Tomas Vondra έγραψε:
>>> On 11/8/24 20:32, Achilleas Mantzios wrote:
>>>> Dear All,
>>>>
>>>> we have hit a serious performance regression going from 10.23 → 16.4 as
>>>> far as tds_fdw (MS SQL) FDW is concerned. To cut the long story short, I
>>>> recreated the good fast “old” (pgsql 10) setup on the same vm as the
>>>> slow “new” (pgsql 16). Here is the bug report on github :
>>>>
>>>> https://github.com/tds-fdw/tds_fdw/issues/371
>>>>
>>>> All environment on the two pgsql clusters is shared (freetds version,
>>>> tds_fdw, gcc, llvm). Only thing differs are pgsql versions. The speed on
>>>> the old pgsql 10.23 is about 10 to 20 times higher than pgsql 16.4 . |
>>>> Setting client_min_messages TO debug3 does yield identical output on the
>>>> two systems.
>>>> |
>>>>
>>>> The new pgsql 16.4 shows 100% CPU usage during the query execution.
>>>>
>>>> I know we are pretty much alone with this, but it would be very nice if
>>>> anyone could help, see smth we are missing or guide us via the right
>>>> path.
>>>>
>>> I have little experience with tds_fdw, and can't investigate that
>>> locally. But it might be interesting to compare CPU profiles for the two
>>> (slow and fast) cases. Chances are the difference will be an indication
>>> regarding what got that slower. It might be something in PG or in the
>>> FDW, hard to say.
>> Thank you Tomas, do you have linux-perf in mind? How could we get the
>> CPU profiles of the two ?  Just in case it was not clear, the tests were
>> done on the same VM, at the same time, having the two DBs listening on
>> different ports. Kinda spooky from my part to do that on the production
>> system, but I wanted everything to be the same (latency to/from the ms
>> sql, etc) except the pgsql version, to prove my assumption, and it seems
>> pgsql version is what makes the difference (in conjunction of course
>> with tds_fdw ) .
> Yes, I mean linux-perf. There's a wikipage with some basic instructions:
>
>    https://wiki.postgresql.org/wiki/Profiling_with_perf
>
> But in short, I'd do this:
>
> 1) use pg_backend_pid() to get PID of the backend
>
>      SELECT pg_backend_pid();
>
> 2) start recording profile for that PID, say for 1 minute
>
>      perf record -g -p $PID -- sleep 60
>
> 3) run the query in a loop
>
>      SELECT count(*) as foo FROM mssql_bdynacom."ACDOC" \watch 1
>
> 4) once the recording stops, generate report
>
>      perf report > report.txt
>
>
> Do this for both versions, share the output txt files. You may need to
> install additional packages with debug symbols to get better profiles,
> and stuff like that.
>
> Also, don't forget to share the explain plans.
Thank you Tomas, as soon as I get my hands on the new VM I will do as 
you suggested and report back.
>
> regard
>



On 11/12/24 11:37, Achilleas Mantzios - cloud wrote:
>
> ...
> 
> We run perf on both systems for 90 seconds to make sure that it included
> at least two runs of the slow system, and we attach both outputs. We run
> perf as :
> 
> sudo perf record -g -p <PID> -- sleep 90
> 
> it strikes me that we dont find similarities between the two, but we are
> new to perf.
> 

Not sure, but it seems the slow profile has a lot of writes. Not sure
why. Do both instances have the same work_mem value / available memory,
storatge?

Maybe EXPLAIN VERBOSE would show if the remote query is the same ...

Maybe try setting log_temp_files=0 before running the query. Writing the
foreign scan result set into a temp file could be happening on one of
the machines only.


regards

-- 
Tomas Vondra




Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4

From
Achilleas Mantzios
Date:


Στις 12/11/24 15:17, ο/η Tomas Vondra έγραψε:
On 11/12/24 11:37, Achilleas Mantzios - cloud wrote:
...

We run perf on both systems for 90 seconds to make sure that it included
at least two runs of the slow system, and we attach both outputs. We run
perf as :

sudo perf record -g -p <PID> -- sleep 90

it strikes me that we dont find similarities between the two, but we are
new to perf.

Not sure, but it seems the slow profile has a lot of writes. Not sure
why. Do both instances have the same work_mem value / available memory,
storatge?
In fact the plain vanilla installation, with all the defaults, runs this fast, please read further.

Maybe EXPLAIN VERBOSE would show if the remote query is the same ...

Maybe try setting log_temp_files=0 before running the query. Writing the
foreign scan result set into a temp file could be happening on one of
the machines only.

No temp files, the remote query shown by EXPLAIN (... VERBOSE) and ms sql it self is :

Remote query: SELECT NULL FROM [db_ro_non_delosnav].[ACDOC]

Aggregate  (cost=168359742.33..168359742.34 rows=1 width=8) (actual time=15100.248..15100.250 rows=1 loops=1)
  Output: count(*)
  ->  Foreign Scan on mssql_bdynacom."ACDOC"  (cost=200.00..168355533.86 rows=1683386 width=100) (actual time=3.783..14967.029 rows=1683386 loops=1)
        Output: "ID", "DOC_TYPE", "PARAST", "TRNS_DATE", "VIA_MNG", "VIA_MNG_ID", "FL_UPD", "NOTES", "REM_DATE", "REM_CLS", "OPN_CLS", "ACSITE_ID", "USERS_ID", "LAST_UPD_DA
TE", "LAST_UPD_USERS_ID", "OTH_TYPE", "OTH_TYPE_DESC", "OTH_APPL", "FL_PRN", "CREATE_DATE", "LOG_NOTE", "PER_ID", "ACDEPT_ID", "FL_DLT", "CMP_ID", "INTPER_ID", "FL_INT_TRNCL
S", "FL_ACTV", "OLD_ACDEPT_ID", "OLD_USERS_ID", "OLD_TRNS_DATE", "OLD_FL_UPD_DATE", "FL_FNLZ", "ERROR_NOTES", "FL_ATTCH", "OPDOC_ID", "FL_ONHOLD", "ONHOLD_NOTES", "FL_RVS",
"RVS_DOC_ID", "FL_REV", "REV_NO", "REV_DOC_ID", "REV_NOTES", "REV_DATE", "REV_USER_ID", "FL_PROT", "INTGR_DOC_ID", "INTGR_DOC_TP", "P_DOC_ID", "FL_OPN", "FL_MDL", "USERUPDAC
C_ID", "DATE_UPDACC", "USERIN_ID", "USERUPD_ID", "DATE_IN", "DATE_UPD", "DOC_SUBTYPE", "DOC_REF", "FRTRN_TYPE", "FL_ACC", "ATCH_DIR", "ATCH_REM", "KEYID", "OTH_APPL1", "RGST
R_ID", "ACDOC_LIST_NO", "EXEC_DATE", "EXEC_USER_ID", "EXEC_NOTES", "CNL_EXEC_DATE", "CNL_EXEC_USER_ID", "FL_EXEC", "CNL_EXEC_NOTES", "LOG_NOTE1", "USERREUPDACC_ID", "DATE_RE
UPDACC", "P_ACDOC_ID", "FL_AUTHORITY_APRV", "FL_APPROVE_TYPE", "FL_NOTINCLUDE_INAUTH", "FL_NO_OTHERCURRENCY", "FL_WFSTATUS", "FL_APRV", "USRACTV_LOG_ID", "USERIN_DNAME", "US
ERUPD_DNAME", "UROLEIN_ID", "UROLEIN_NAME", "UDEPTIN_ID", "UDEPTIN_NAME", "FL_LOAD_5805", "UROLEUPD_ID", "UROLEUPD_NAME", "CNTRY_ID", fl_togetback, "FULLPATH", "FL_UPD_OLD",
"FL_REOINTG", "FL_OINTGR_RESEND"
        Remote query: SELECT NULL FROM [db_ro_non_delosnav].[ACDOC]
Query Identifier: 6812542821581303630
Planning Time: 14029.724 ms
JIT:
  Functions: 2
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 0.168 ms, Inlining 2.087 ms, Optimization 3.301 ms, Emission 3.564 ms, Total 9.121 ms
Execution Time: 15102.803 ms
(12 rows)

I gather this is how it computes the plan.

I think we have narrowed down the problem, and this is extremely strange :

It is not a matter of pgsql version 10 or 16, it is a matter of the postgres user, the user that owns the data dir(s) and the user of the postgres process. We  reproduced both the problem and the solution with all combinations of versions.

To sum it up :

Our sysadm created the system debian Debian GNU/Linux 12 (bookworm) with postgres as user 1000. Now at some point we realized that whenever we run a pgsql cluster with another user (I found that after spending two good days testing), the above query runs in about 1 second. With user postgres 1000 in 30 seconds. As you saw the perf output are completely different.

On the test VM, we removed the postgres user, recreated with uid=1003, chown -R all the mount points + table spaces, started postgres with the new postgres user , and bingo . The chown -R on the 5TB is instant... just saying ...

We are puzzled what can be causing this. Tomorrow we dig into GDB , *trace and the like.

If this rings any bells we would be more than grateful to know.

Also, the worse, is the suspicion that maybe our whole infra performance is affected. We hope it is only free-tds and tds_fdw .



regards

Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes:
> Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with 
> postgres as user 1000. Now at some point we realized that whenever we 
> run a pgsql cluster with another user (I found that after spending two 
> good days testing), the above query runs in about 1 second. With user 
> postgres 1000 in 30 seconds. As you saw the perf output are completely 
> different.

Don't recall details offhand, but in some situations where the calling
SQL user doesn't have permissions to read particular columns, the
planner will not consult statistics for those columns.  That can lead
to a different, less optimal plan being used.  Maybe something like
that is happening here?

            regards, tom lane




On 11/12/24 20:37, Tom Lane wrote:
> Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes:
>> Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with 
>> postgres as user 1000. Now at some point we realized that whenever we 
>> run a pgsql cluster with another user (I found that after spending two 
>> good days testing), the above query runs in about 1 second. With user 
>> postgres 1000 in 30 seconds. As you saw the perf output are completely 
>> different.
> 
> Don't recall details offhand, but in some situations where the calling
> SQL user doesn't have permissions to read particular columns, the
> planner will not consult statistics for those columns.  That can lead
> to a different, less optimal plan being used.  Maybe something like
> that is happening here?
> 

I don't know, the query is pretty trivial, and the estimates seemed
exactly the same in both cases. And it shouldn't affect how the query
gets planned on the MSSQL side.

But this seems really strange:

  Planning Time: 14029.724 ms
  ...
  Execution Time: 15102.803 ms

It's not about the execution, it's about the planning. I have no idea
why should the planning take this long, except maybe for waiting for a
lock, or something like that. But that's not really consistent with the
profile ... it's weird.

I'm not familiar with tds_fdw, but I see there are a bunch of table
options [1] that might affect this, namely:

* use_remote_estimate
* local_tuple_estimate
* row_estimate_method (defaults to 'execute')

Are you sure these are set to the same value on both machines?

Wild random guesses:

1) Could you try running the query with jit=off?

2) Did you run ANALYZE on the foreign table? Could matter when not using
remote estimates (use_remote_estimate=false).

3) Could it be some sort of memory pressure/swapping? But that would
look different in the profile, AFAIK.


regards


[1] https://github.com/tds-fdw/tds_fdw/blob/master/ForeignTableCreation.md

-- 
Tomas Vondra




Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4

From
Achilleas Mantzios - cloud
Date:
On 11/12/24 21:37, Tom Lane wrote:
> Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes:
>> Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with
>> postgres as user 1000. Now at some point we realized that whenever we
>> run a pgsql cluster with another user (I found that after spending two
>> good days testing), the above query runs in about 1 second. With user
>> postgres 1000 in 30 seconds. As you saw the perf output are completely
>> different.
> Don't recall details offhand, but in some situations where the calling
> SQL user doesn't have permissions to read particular columns, the
> planner will not consult statistics for those columns.  That can lead
> to a different, less optimal plan being used.  Maybe something like
> that is happening here?

Thank you Tom,

In all runs of this query the user is postgres (superuser). Besides, 
this is just a simple query run on a FOREIGN TABLE living in a MS SQL 
Server.

>
>             regards, tom lane



Re: tds_fdw : Severe performance degradation from postgresql 10.23 to 16.4

From
Achilleas Mantzios - cloud
Date:
Dear All

false alert, I run strace and it was obvious the slow one was producing 
huge debug output, while the fast one did not. It was not even a tds_fdw 
issue. It was freetds. Turned out we have forgotten enabled debugging 
inside the freetds configuration. You will ask me we did I get this 
effect of debugging only when run with user "postgres" ? Because 
/tmp/freetds.log belonged to postgres!

I changed owner to nobody:nogroup and dont even need to restart postgres. !!

I am sorry for all the noise.

Thank you so much for your prompts !!

On 11/12/24 23:46, Tomas Vondra wrote:
>
> On 11/12/24 20:37, Tom Lane wrote:
>> Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes:
>>> Our sysadm created the system debian Debian GNU/Linux 12 (bookworm)with
>>> postgres as user 1000. Now at some point we realized that whenever we
>>> run a pgsql cluster with another user (I found that after spending two
>>> good days testing), the above query runs in about 1 second. With user
>>> postgres 1000 in 30 seconds. As you saw the perf output are completely
>>> different.
>> Don't recall details offhand, but in some situations where the calling
>> SQL user doesn't have permissions to read particular columns, the
>> planner will not consult statistics for those columns.  That can lead
>> to a different, less optimal plan being used.  Maybe something like
>> that is happening here?
>>
> I don't know, the query is pretty trivial, and the estimates seemed
> exactly the same in both cases. And it shouldn't affect how the query
> gets planned on the MSSQL side.
>
> But this seems really strange:
>
>    Planning Time: 14029.724 ms
>    ...
>    Execution Time: 15102.803 ms
>
> It's not about the execution, it's about the planning. I have no idea
> why should the planning take this long, except maybe for waiting for a
> lock, or something like that. But that's not really consistent with the
> profile ... it's weird.
>
> I'm not familiar with tds_fdw, but I see there are a bunch of table
> options [1] that might affect this, namely:
>
> * use_remote_estimate
> * local_tuple_estimate
> * row_estimate_method (defaults to 'execute')
>
> Are you sure these are set to the same value on both machines?
>
> Wild random guesses:
>
> 1) Could you try running the query with jit=off?
>
> 2) Did you run ANALYZE on the foreign table? Could matter when not using
> remote estimates (use_remote_estimate=false).
>
> 3) Could it be some sort of memory pressure/swapping? But that would
> look different in the profile, AFAIK.
>
>
> regards
>
>
> [1] https://github.com/tds-fdw/tds_fdw/blob/master/ForeignTableCreation.md
>