Re: Performance killed with FDW when using CAST. - Mailing list pgsql-admin

From Jorge Torralba
Subject Re: Performance killed with FDW when using CAST.
Date
Msg-id CACut7uTytT_mUmge6x489oy+95dLgxonJaY-8sA7ebxBtppsoQ@mail.gmail.com
Whole thread Raw
In response to Re: Performance killed with FDW when using CAST.  (Ian Barwick <ian.barwick@2ndquadrant.com>)
Responses Re: Performance killed with FDW when using CAST.  (Ron <ronljohnsonjr@gmail.com>)
List pgsql-admin
Thanks for taking the time to look.

Both servers are on ....

                                                             version
----------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

I have reduced the query to select only two columns for simplicity with the same results.

The table contain about 50 million rows

On the server hosting the table ..... Hiding private data.

alertsdb_recent_events=#  select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                  id                  | ?column?
--------------------------------------+----------
 ***** | 2617116
 ***** | 2617116
 ***** | 2617116
 ***** | 2617116
 ***** | 2617116
(5 rows)

Time: 82.868 ms


# explain  select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..3.77 rows=5 width=48)
   ->  Index Scan using recent_event_account_id_attributes_account_incident_id_idx on recent_events  (cost=0.56..56.31 rows=87 width=48)
         Index Cond: ((account_id = 1) AND (((attributes -> 'account_incident_id'::text))::integer = 2617116))
(3 rows)

Time: 71.907 ms

# explain  analyze select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..3.77 rows=5 width=48) (actual time=0.019..0.025 rows=5 loops=1)
   ->  Index Scan using recent_event_account_id_attributes_account_incident_id_idx on recent_events  (cost=0.56..56.31 rows=87 width=48) (actual time=0.018..0.023 rows=5 loops=1)
         Index Cond: ((account_id = 1) AND (((attributes -> 'account_incident_id'::text))::integer = 2617116))
 Planning Time: 0.124 ms
 Execution Time: 0.038 ms
(5 rows)

Time: 80.782 ms


On the server that communicates with the FDW server .....

# explain  select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Limit  (cost=100.00..426.27 rows=5 width=48)
   ->  Foreign Scan on recent_events  (cost=100.00..6663659.61 rows=102117 width=48)
         Filter: (((attributes -> 'account_incident_id'::text))::integer = 2617116)
(3 rows)

Time: 85.276 ms


# explain  analyze select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=100.00..426.27 rows=5 width=48) (actual time=21242.087..26387.257 rows=5 loops=1)
   ->  Foreign Scan on recent_events  (cost=100.00..6663703.90 rows=102117 width=48) (actual time=21242.086..26387.252 rows=5 loops=1)
         Filter: (((attributes -> 'account_incident_id'::text))::integer = 2617116)
         Rows Removed by Filter: 724249
 Planning Time: 1.164 ms
 Execution Time: 26387.851 ms
(6 rows)

Time: 26528.113 ms (00:26.528)

The query killer is the ...

AND (attributes -> 'account_incident_id')::integer = 2617116

Run the query this way ...

select id, attributes -> 'account_incident_id' from recent_events where account_id = 1  limit 5;

and the results is only 10ms slower than on the hosting server directly which is what we are expecting.  It's like the casting of the hstore column is just not playing nice.

Thanks


On Tue, Apr 16, 2019 at 6:56 PM Ian Barwick <ian.barwick@2ndquadrant.com> wrote:
On 4/17/19 9:21 AM, Jorge Torralba wrote:
> We setup a FDW server for a large table that has an hstore data type as on eof the coums.
>
> When we run the query directly on the FDW server
>
> select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10;
>
> The response is 2ms . which is awesome.
>
> When we run it from the remote server where we have the server defined, the query never returns.
>
> Our server definition is as follows.
>
> (host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore')
>
> We have been playing with this for hours to no avail.
>
> Any clues ?

At a guess the FDW is not able to push down the entire query to the FDW server
and might be doing something like fetching all the remote rows and casting
them locally, or something along those lines.

It would help to provide the following info:

- PostgreSQL versions of both servers
- "EXPLAIN" output of the query on both servers
- "EXPLAIN ANALYZE" output of the query executed directly on the FDW server
- approximate number of rows in the "recent_events" table


Regards

Ian Barwick

--
  Ian Barwick                   https://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.

pgsql-admin by date:

Previous
From: Ian Barwick
Date:
Subject: Re: Error when select global.service table
Next
From: Ron
Date:
Subject: Re: Performance killed with FDW when using CAST.