Thread: Performance killed with FDW when using CAST.
We setup a FDW server for a large table that has an hstore data type as on eof the coums.
When we run the query directly on the FDW server
select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10;
The response is 2ms . which is awesome.
When we run it from the remote server where we have the server defined, the query never returns.
Our server definition is as follows.
(host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore')
We have been playing with this for hours to no avail.
Any clues ?
Thanks
--
Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
On 4/17/19 9:21 AM, Jorge Torralba wrote: > We setup a FDW server for a large table that has an hstore data type as on eof the coums. > > When we run the query directly on the FDW server > > select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10; > > The response is 2ms . which is awesome. > > When we run it from the remote server where we have the server defined, the query never returns. > > Our server definition is as follows. > > (host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore') > > We have been playing with this for hours to no avail. > > Any clues ? At a guess the FDW is not able to push down the entire query to the FDW server and might be doing something like fetching all the remote rows and casting them locally, or something along those lines. It would help to provide the following info: - PostgreSQL versions of both servers - "EXPLAIN" output of the query on both servers - "EXPLAIN ANALYZE" output of the query executed directly on the FDW server - approximate number of rows in the "recent_events" table Regards Ian Barwick -- Ian Barwick https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Thanks for taking the time to look.
Both servers are on ....
version
----------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
I have reduced the query to select only two columns for simplicity with the same results.
The table contain about 50 million rows
On the server hosting the table ..... Hiding private data.
alertsdb_recent_events=# select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
id | ?column?
--------------------------------------+----------
***** | 2617116
***** | 2617116
***** | 2617116
***** | 2617116
***** | 2617116
(5 rows)
Time: 82.868 ms
# explain select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..3.77 rows=5 width=48)
-> Index Scan using recent_event_account_id_attributes_account_incident_id_idx on recent_events (cost=0.56..56.31 rows=87 width=48)
Index Cond: ((account_id = 1) AND (((attributes -> 'account_incident_id'::text))::integer = 2617116))
(3 rows)
Time: 71.907 ms
# explain analyze select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..3.77 rows=5 width=48) (actual time=0.019..0.025 rows=5 loops=1)
-> Index Scan using recent_event_account_id_attributes_account_incident_id_idx on recent_events (cost=0.56..56.31 rows=87 width=48) (actual time=0.018..0.023 rows=5 loops=1)
Index Cond: ((account_id = 1) AND (((attributes -> 'account_incident_id'::text))::integer = 2617116))
Planning Time: 0.124 ms
Execution Time: 0.038 ms
(5 rows)
Time: 80.782 ms
On the server that communicates with the FDW server .....
# explain select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=100.00..426.27 rows=5 width=48)
-> Foreign Scan on recent_events (cost=100.00..6663659.61 rows=102117 width=48)
Filter: (((attributes -> 'account_incident_id'::text))::integer = 2617116)
(3 rows)
Time: 85.276 ms
# explain analyze select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 5;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=100.00..426.27 rows=5 width=48) (actual time=21242.087..26387.257 rows=5 loops=1)
-> Foreign Scan on recent_events (cost=100.00..6663703.90 rows=102117 width=48) (actual time=21242.086..26387.252 rows=5 loops=1)
Filter: (((attributes -> 'account_incident_id'::text))::integer = 2617116)
Rows Removed by Filter: 724249
Planning Time: 1.164 ms
Execution Time: 26387.851 ms
(6 rows)
Time: 26528.113 ms (00:26.528)
The query killer is the ...
AND (attributes -> 'account_incident_id')::integer = 2617116
Run the query this way ...
select id, attributes -> 'account_incident_id' from recent_events where account_id = 1 limit 5;
and the results is only 10ms slower than on the hosting server directly which is what we are expecting. It's like the casting of the hstore column is just not playing nice.
Thanks
On Tue, Apr 16, 2019 at 6:56 PM Ian Barwick <ian.barwick@2ndquadrant.com> wrote:
On 4/17/19 9:21 AM, Jorge Torralba wrote:
> We setup a FDW server for a large table that has an hstore data type as on eof the coums.
>
> When we run the query directly on the FDW server
>
> select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10;
>
> The response is 2ms . which is awesome.
>
> When we run it from the remote server where we have the server defined, the query never returns.
>
> Our server definition is as follows.
>
> (host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore')
>
> We have been playing with this for hours to no avail.
>
> Any clues ?
At a guess the FDW is not able to push down the entire query to the FDW server
and might be doing something like fetching all the remote rows and casting
them locally, or something along those lines.
It would help to provide the following info:
- PostgreSQL versions of both servers
- "EXPLAIN" output of the query on both servers
- "EXPLAIN ANALYZE" output of the query executed directly on the FDW server
- approximate number of rows in the "recent_events" table
Regards
Ian Barwick
--
Ian Barwick https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
On 4/16/19 9:40 PM, Jorge Torralba wrote: > Thanks for taking the time to look. > > Both servers are on .... > > version > ---------------------------------------------------------------------------------------------------------------------------------- > PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled > by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit > > I have reduced the query to select only two columns for simplicity with > the same results. > > The table contain about 50 million rows > > On the server hosting the table ..... Hiding private data. > > alertsdb_recent_events=# select id, attributes -> 'account_incident_id' > from recent_events where account_id = 1 AND (attributes -> > 'account_incident_id')::integer = 2617116 limit 5; > id | ?column? > --------------------------------------+---------- > ***** | 2617116 > ***** | 2617116 > ***** | 2617116 > ***** | 2617116 > ***** | 2617116 > (5 rows) > > Time: 82.868 ms > > > # explain select id, attributes -> 'account_incident_id' from > recent_events where account_id = 1 AND (attributes -> > 'account_incident_id')::integer = 2617116 limit 5; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.56..3.77 rows=5 width=48) > -> Index Scan using > recent_event_account_id_attributes_account_incident_id_idx on > recent_events (cost=0.56..56.31 rows=87 width=48) > Index Cond: ((account_id = 1) AND (((attributes -> > 'account_incident_id'::text))::integer = 2617116)) > (3 rows) > > Time: 71.907 ms > > # explain analyze select id, attributes -> 'account_incident_id' from > recent_events where account_id = 1 AND (attributes -> > 'account_incident_id')::integer = 2617116 limit 5; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.56..3.77 rows=5 width=48) (actual time=0.019..0.025 rows=5 > loops=1) > -> Index Scan using > recent_event_account_id_attributes_account_incident_id_idx on > recent_events (cost=0.56..56.31 rows=87 width=48) (actual > time=0.018..0.023 rows=5 loops=1) > Index Cond: ((account_id = 1) AND (((attributes -> > 'account_incident_id'::text))::integer = 2617116)) > Planning Time: 0.124 ms > Execution Time: 0.038 ms > (5 rows) > > Time: 80.782 ms > > > On the server that communicates with the FDW server ..... > > # explain select id, attributes -> 'account_incident_id' from > recent_events where account_id = 1 AND (attributes -> > 'account_incident_id')::integer = 2617116 limit 5; > QUERY PLAN > ------------------------------------------------------------------------------------- > Limit (cost=100.00..426.27 rows=5 width=48) > -> Foreign Scan on recent_events (cost=100.00..6663659.61 rows=102117 > width=48) > Filter: (((attributes -> 'account_incident_id'::text))::integer = > 2617116) > (3 rows) > > Time: 85.276 ms > > > # explain analyze select id, attributes -> 'account_incident_id' from > recent_events where account_id = 1 AND (attributes -> > 'account_incident_id')::integer = 2617116 limit 5; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=100.00..426.27 rows=5 width=48) (actual > time=21242.087..26387.257 rows=5 loops=1) > -> Foreign Scan on recent_events (cost=100.00..6663703.90 rows=102117 > width=48) (actual time=21242.086..26387.252 rows=5 loops=1) > Filter: (((attributes -> 'account_incident_id'::text))::integer = > 2617116) > Rows Removed by Filter: 724249 > Planning Time: 1.164 ms > Execution Time: 26387.851 ms > (6 rows) > > Time: 26528.113 ms (00:26.528) > > The query killer is the ... > > AND (attributes -> 'account_incident_id')::integer = 2617116 > > Run the query this way ... > > select id, attributes -> 'account_incident_id' from recent_events where > account_id = 1 limit 5; > > and the results is only 10ms slower than on the hosting server directly > which is what we are expecting. It's like the casting of the hstore > column is just not playing nice. Casting the left side of a predicate is not recommended. What if you cast 2617166 to be the same type as attributes -> 'account_incident_id'? -- Angular momentum makes the world go 'round.
I tried without casting and still the same problem on the server accessing remote data.
On the server hosting the data, even with casting the query executes in 78ms.
fty, I have run analyze on both the servers as well.
Thanks
JT
On Tue, Apr 16, 2019 at 8:00 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 4/16/19 9:40 PM, Jorge Torralba wrote:
> Thanks for taking the time to look.
>
> Both servers are on ....
>
> version
> ----------------------------------------------------------------------------------------------------------------------------------
> PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled
> by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
>
> I have reduced the query to select only two columns for simplicity with
> the same results.
>
> The table contain about 50 million rows
>
> On the server hosting the table ..... Hiding private data.
>
> alertsdb_recent_events=# select id, attributes -> 'account_incident_id'
> from recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> id | ?column?
> --------------------------------------+----------
> ***** | 2617116
> ***** | 2617116
> ***** | 2617116
> ***** | 2617116
> ***** | 2617116
> (5 rows)
>
> Time: 82.868 ms
>
>
> # explain select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.56..3.77 rows=5 width=48)
> -> Index Scan using
> recent_event_account_id_attributes_account_incident_id_idx on
> recent_events (cost=0.56..56.31 rows=87 width=48)
> Index Cond: ((account_id = 1) AND (((attributes ->
> 'account_incident_id'::text))::integer = 2617116))
> (3 rows)
>
> Time: 71.907 ms
>
> # explain analyze select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.56..3.77 rows=5 width=48) (actual time=0.019..0.025 rows=5
> loops=1)
> -> Index Scan using
> recent_event_account_id_attributes_account_incident_id_idx on
> recent_events (cost=0.56..56.31 rows=87 width=48) (actual
> time=0.018..0.023 rows=5 loops=1)
> Index Cond: ((account_id = 1) AND (((attributes ->
> 'account_incident_id'::text))::integer = 2617116))
> Planning Time: 0.124 ms
> Execution Time: 0.038 ms
> (5 rows)
>
> Time: 80.782 ms
>
>
> On the server that communicates with the FDW server .....
>
> # explain select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> QUERY PLAN
> -------------------------------------------------------------------------------------
> Limit (cost=100.00..426.27 rows=5 width=48)
> -> Foreign Scan on recent_events (cost=100.00..6663659.61 rows=102117
> width=48)
> Filter: (((attributes -> 'account_incident_id'::text))::integer =
> 2617116)
> (3 rows)
>
> Time: 85.276 ms
>
>
> # explain analyze select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=100.00..426.27 rows=5 width=48) (actual
> time=21242.087..26387.257 rows=5 loops=1)
> -> Foreign Scan on recent_events (cost=100.00..6663703.90 rows=102117
> width=48) (actual time=21242.086..26387.252 rows=5 loops=1)
> Filter: (((attributes -> 'account_incident_id'::text))::integer =
> 2617116)
> Rows Removed by Filter: 724249
> Planning Time: 1.164 ms
> Execution Time: 26387.851 ms
> (6 rows)
>
> Time: 26528.113 ms (00:26.528)
>
> The query killer is the ...
>
> AND (attributes -> 'account_incident_id')::integer = 2617116
>
> Run the query this way ...
>
> select id, attributes -> 'account_incident_id' from recent_events where
> account_id = 1 limit 5;
>
> and the results is only 10ms slower than on the hosting server directly
> which is what we are expecting. It's like the casting of the hstore
> column is just not playing nice.
Casting the left side of a predicate is not recommended. What if you cast
2617166 to be the same type as attributes -> 'account_incident_id'?
--
Angular momentum makes the world go 'round.
Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
I tried without casting and still the same problem on the server accessing remote data.
On the server hosting the data, even with casting the query executes in 78ms.
fty, I have run analyze on both the servers as well.
Thanks
JT
On Tue, Apr 16, 2019 at 8:00 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 4/16/19 9:40 PM, Jorge Torralba wrote:
> Thanks for taking the time to look.
>
> Both servers are on ....
>
> version
> ----------------------------------------------------------------------------------------------------------------------------------
> PostgreSQL 11.2 (Debian 11.2-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled
> by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
>
> I have reduced the query to select only two columns for simplicity with
> the same results.
>
> The table contain about 50 million rows
>
> On the server hosting the table ..... Hiding private data.
>
> alertsdb_recent_events=# select id, attributes -> 'account_incident_id'
> from recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> id | ?column?
> --------------------------------------+----------
> ***** | 2617116
> ***** | 2617116
> ***** | 2617116
> ***** | 2617116
> ***** | 2617116
> (5 rows)
>
> Time: 82.868 ms
>
>
> # explain select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.56..3.77 rows=5 width=48)
> -> Index Scan using
> recent_event_account_id_attributes_account_incident_id_idx on
> recent_events (cost=0.56..56.31 rows=87 width=48)
> Index Cond: ((account_id = 1) AND (((attributes ->
> 'account_incident_id'::text))::integer = 2617116))
> (3 rows)
>
> Time: 71.907 ms
>
> # explain analyze select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.56..3.77 rows=5 width=48) (actual time=0.019..0.025 rows=5
> loops=1)
> -> Index Scan using
> recent_event_account_id_attributes_account_incident_id_idx on
> recent_events (cost=0.56..56.31 rows=87 width=48) (actual
> time=0.018..0.023 rows=5 loops=1)
> Index Cond: ((account_id = 1) AND (((attributes ->
> 'account_incident_id'::text))::integer = 2617116))
> Planning Time: 0.124 ms
> Execution Time: 0.038 ms
> (5 rows)
>
> Time: 80.782 ms
>
>
> On the server that communicates with the FDW server .....
>
> # explain select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> QUERY PLAN
> -------------------------------------------------------------------------------------
> Limit (cost=100.00..426.27 rows=5 width=48)
> -> Foreign Scan on recent_events (cost=100.00..6663659.61 rows=102117
> width=48)
> Filter: (((attributes -> 'account_incident_id'::text))::integer =
> 2617116)
> (3 rows)
>
> Time: 85.276 ms
>
>
> # explain analyze select id, attributes -> 'account_incident_id' from
> recent_events where account_id = 1 AND (attributes ->
> 'account_incident_id')::integer = 2617116 limit 5;
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=100.00..426.27 rows=5 width=48) (actual
> time=21242.087..26387.257 rows=5 loops=1)
> -> Foreign Scan on recent_events (cost=100.00..6663703.90 rows=102117
> width=48) (actual time=21242.086..26387.252 rows=5 loops=1)
> Filter: (((attributes -> 'account_incident_id'::text))::integer =
> 2617116)
> Rows Removed by Filter: 724249
> Planning Time: 1.164 ms
> Execution Time: 26387.851 ms
> (6 rows)
>
> Time: 26528.113 ms (00:26.528)
>
> The query killer is the ...
>
> AND (attributes -> 'account_incident_id')::integer = 2617116
>
> Run the query this way ...
>
> select id, attributes -> 'account_incident_id' from recent_events where
> account_id = 1 limit 5;
>
> and the results is only 10ms slower than on the hosting server directly
> which is what we are expecting. It's like the casting of the hstore
> column is just not playing nice.
Casting the left side of a predicate is not recommended. What if you cast
2617166 to be the same type as attributes -> 'account_incident_id'?
--
Angular momentum makes the world go 'round.
Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Jorge Torralba wrote: > We setup a FDW server for a large table that has an hstore data type as on eof the coums. > > When we run the query directly on the FDW server > > select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10; > > The response is 2ms . which is awesome. > > When we run it from the remote server where we have the server defined, the query never returns. > > Our server definition is as follows. > > (host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore') After debugging into this, it seems that the hstore operator -> cannot be pushed down because of collation problems. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Laurenz Albe schrieb am 17.04.2019 um 07:03: >> We setup a FDW server for a large table that has an hstore data type as on eof the coums. >> >> When we run the query directly on the FDW server >> >> select * from recent_events where account_id = 1 AND (attributes -> 'account_incident_id')::integer = 2617116 limit 10; >> >> The response is 2ms . which is awesome. >> >> When we run it from the remote server where we have the server defined, the query never returns. >> >> Our server definition is as follows. >> >> (host 'fdwserver', dbname 'mydb', port '5432', fetch_size '1000', use_remote_estimate 'true', extensions 'hstore') > > After debugging into this, it seems that the hstore operator -> cannot > be pushed down because of collation problems. Do you happen to know if the JSONB operator -> (or ->>) can be pushed down? Although I realize it would be a major change in the database design, maybe that would be a solution for Jorge. Thomas
Thomas Kellerer <spam_eater@gmx.net> writes: > Laurenz Albe schrieb am 17.04.2019 um 07:03: >> After debugging into this, it seems that the hstore operator -> cannot >> be pushed down because of collation problems. > Do you happen to know if the JSONB operator -> (or ->>) can be pushed down? A bit of experimentation says that jsonb -> integer can be pushed down, but not any of the variants involving a text fieldname or result. Presumably this is because of the heuristic that says not to push down a collation that didn't arise from the remote column. jsonb -> text isn't really collation-sensitive, of course, but postgres_fdw has no good way to know that, since the core code (outside of that operator itself) doesn't know it either. The assumption is that any function with at least one input of a collatable type is collation-sensitive. Here you're getting a default collation from the text literal, and postgres_fdw doesn't want to assume that the remote end would choose the same collation. regards, tom lane
Thanks for the info. Is there a way to match the collations and make it work?
On Wed, Apr 17, 2019, 07:02 Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thomas Kellerer <spam_eater@gmx.net> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.
> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?
A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column. jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either. The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.
regards, tom lane
I made a copy of the table and altered the column from hstore to jsonb.
Ran the following query with the same performance issues.
SELECT id, attributes FROM xxx WHERE account_id = 1 AND timestamp >= '2019-01-16 22:34:28.584' AND CAST(attributes ->> 'account_incident_id' as integer) = 2617116 order by timestamp desc limit 10;
Things to Note.
Remove the CAST on the attributes column and the order by results in quick performance
Add order by performance dies
Add CAST without the order by you can go out for dinner and still be waiting for a result set.
On Wed, Apr 17, 2019 at 7:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thomas Kellerer <spam_eater@gmx.net> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.
> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?
A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column. jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either. The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.
regards, tom lane
Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
I made a copy of the table and altered the column from hstore to jsonb.
Ran the following query with the same performance issues.
SELECT id, attributes FROM xxx WHERE account_id = 1 AND timestamp >= '2019-01-16 22:34:28.584' AND CAST(attributes ->> 'account_incident_id' as integer) = 2617116 order by timestamp desc limit 10;
Things to Note.
Remove the CAST on the attributes column and the order by results in quick performance
Add order by performance dies
Add CAST without the order by you can go out for dinner and still be waiting for a result set.
On Wed, Apr 17, 2019 at 7:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thomas Kellerer <spam_eater@gmx.net> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.
> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?
A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column. jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either. The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.
regards, tom lane
Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Hi Jorge,
Can you create a view in the source database such as
CREATE VIEW xxx_id_attributes AS
SELECT id, CAST(attributes->>'account_incident_id' AS integer)
FROM xxx WHERE attributes->'account_incident_id' ~ '^[0-9]+$';
On the remote server, create a foreign table on the new view and perform your test.
Just curious.
-Greg
On Wed, Apr 17, 2019 at 2:59 PM Jorge Torralba <jorge.torralba@gmail.com> wrote:
I made a copy of the table and altered the column from hstore to jsonb.Ran the following query with the same performance issues.SELECT id, attributes FROM xxx WHERE account_id = 1 AND timestamp >= '2019-01-16 22:34:28.584' AND CAST(attributes ->> 'account_incident_id' as integer) = 2617116 order by timestamp desc limit 10;Things to Note.Remove the CAST on the attributes column and the order by results in quick performanceAdd order by performance diesAdd CAST without the order by you can go out for dinner and still be waiting for a result set.On Wed, Apr 17, 2019 at 7:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:Thomas Kellerer <spam_eater@gmx.net> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.
> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?
A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column. jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either. The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.
regards, tom lane--Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Hi Jorge,
Can you create a view in the source database such as
CREATE VIEW xxx_id_attributes AS
SELECT id, CAST(attributes->>'account_incident_id' AS integer)
FROM xxx WHERE attributes->'account_incident_id' ~ '^[0-9]+$';
On the remote server, create a foreign table on the new view and perform your test.
Just curious.
-Greg
On Wed, Apr 17, 2019 at 2:59 PM Jorge Torralba <jorge.torralba@gmail.com> wrote:
I made a copy of the table and altered the column from hstore to jsonb.Ran the following query with the same performance issues.SELECT id, attributes FROM xxx WHERE account_id = 1 AND timestamp >= '2019-01-16 22:34:28.584' AND CAST(attributes ->> 'account_incident_id' as integer) = 2617116 order by timestamp desc limit 10;Things to Note.Remove the CAST on the attributes column and the order by results in quick performanceAdd order by performance diesAdd CAST without the order by you can go out for dinner and still be waiting for a result set.On Wed, Apr 17, 2019 at 7:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:Thomas Kellerer <spam_eater@gmx.net> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.
> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?
A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column. jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either. The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.
regards, tom lane--Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Funny that you mentioned that. I created the view a couple of hours ago and that resolve the problem. The view had the order by on the timestamp as well so it does not have to send the data over for sorting. I wish there was a way for postgres to allow control of what happens on which server when using foreign data wrappers. If there is I would like to know about it
On Wed, Apr 17, 2019, 15:40 Greg Spiegelberg <gspiegelberg@gmail.com> wrote:
Hi Jorge,Can you create a view in the source database such asCREATE VIEW xxx_id_attributes ASSELECT id, CAST(attributes->>'account_incident_id' AS integer)FROM xxx WHERE attributes->'account_incident_id' ~ '^[0-9]+$';On the remote server, create a foreign table on the new view and perform your test.Just curious.-GregOn Wed, Apr 17, 2019 at 2:59 PM Jorge Torralba <jorge.torralba@gmail.com> wrote:I made a copy of the table and altered the column from hstore to jsonb.Ran the following query with the same performance issues.SELECT id, attributes FROM xxx WHERE account_id = 1 AND timestamp >= '2019-01-16 22:34:28.584' AND CAST(attributes ->> 'account_incident_id' as integer) = 2617116 order by timestamp desc limit 10;Things to Note.Remove the CAST on the attributes column and the order by results in quick performanceAdd order by performance diesAdd CAST without the order by you can go out for dinner and still be waiting for a result set.On Wed, Apr 17, 2019 at 7:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:Thomas Kellerer <spam_eater@gmx.net> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.
> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?
A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column. jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either. The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.
regards, tom lane--Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.
Funny that you mentioned that. I created the view a couple of hours ago and that resolve the problem. The view had the order by on the timestamp as well so it does not have to send the data over for sorting. I wish there was a way for postgres to allow control of what happens on which server when using foreign data wrappers. If there is I would like to know about it
On Wed, Apr 17, 2019, 15:40 Greg Spiegelberg <gspiegelberg@gmail.com> wrote:
Hi Jorge,Can you create a view in the source database such asCREATE VIEW xxx_id_attributes ASSELECT id, CAST(attributes->>'account_incident_id' AS integer)FROM xxx WHERE attributes->'account_incident_id' ~ '^[0-9]+$';On the remote server, create a foreign table on the new view and perform your test.Just curious.-GregOn Wed, Apr 17, 2019 at 2:59 PM Jorge Torralba <jorge.torralba@gmail.com> wrote:I made a copy of the table and altered the column from hstore to jsonb.Ran the following query with the same performance issues.SELECT id, attributes FROM xxx WHERE account_id = 1 AND timestamp >= '2019-01-16 22:34:28.584' AND CAST(attributes ->> 'account_incident_id' as integer) = 2617116 order by timestamp desc limit 10;Things to Note.Remove the CAST on the attributes column and the order by results in quick performanceAdd order by performance diesAdd CAST without the order by you can go out for dinner and still be waiting for a result set.On Wed, Apr 17, 2019 at 7:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:Thomas Kellerer <spam_eater@gmx.net> writes:
> Laurenz Albe schrieb am 17.04.2019 um 07:03:
>> After debugging into this, it seems that the hstore operator -> cannot
>> be pushed down because of collation problems.
> Do you happen to know if the JSONB operator -> (or ->>) can be pushed down?
A bit of experimentation says that jsonb -> integer can be pushed down,
but not any of the variants involving a text fieldname or result.
Presumably this is because of the heuristic that says not to push down
a collation that didn't arise from the remote column. jsonb -> text
isn't really collation-sensitive, of course, but postgres_fdw has no
good way to know that, since the core code (outside of that operator
itself) doesn't know it either. The assumption is that any function
with at least one input of a collatable type is collation-sensitive.
Here you're getting a default collation from the text literal, and
postgres_fdw doesn't want to assume that the remote end would choose
the same collation.
regards, tom lane--Thanks,
Jorge Torralba
----------------------------
Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.