Thread: [HACKERS] An issue in remote query optimization
Hi,
Postgres_fdw optimizes remote queries by pushing down the where clause.Create the table on the foreign server and insert some rows in it
create table numbers(a int, b varchar(255));
insert into numbers values(1, 'One');
insert into numbers values(2, 'Two');
insert into numbers values(3, 'Three');
insert into numbers values(4, 'Four');
insert into numbers values(5, 'Five');
insert into numbers values(6, 'Six');
insert into numbers values(7, 'Seven');
insert into numbers values(8, 'Eight');
insert into numbers values(9, 'Nine');
create table numbers(a int, b varchar(255));
insert into numbers values(1, 'One');
insert into numbers values(2, 'Two');
insert into numbers values(3, 'Three');
insert into numbers values(4, 'Four');
insert into numbers values(5, 'Five');
insert into numbers values(6, 'Six');
insert into numbers values(7, 'Seven');
insert into numbers values(8, 'Eight');
insert into numbers values(9, 'Nine');
Step 2:
Create the following objects on the local server
CREATE SERVER pg_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '127.0.0.1', port '5552', dbname 'test');
CREATE USER MAPPING FOR abbasbutt SERVER pg_server OPTIONS (user 'abbasbutt', password 'abc123');
CREATE FOREIGN TABLE foreign_numbers(a int, b varchar(255)) SERVER pg_server OPTIONS (table_name 'numbers');
create or replace function test_pg_fdw() returns void as $$
DECLARE
n varchar;
BEGIN
FOR x IN 1..9 LOOP
select b into n from foreign_numbers where a=x;
raise notice 'Found number %', n;
end loop;
return;
END
$$ LANGUAGE plpgsql;
CREATE SERVER pg_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '127.0.0.1', port '5552', dbname 'test');
CREATE USER MAPPING FOR abbasbutt SERVER pg_server OPTIONS (user 'abbasbutt', password 'abc123');
CREATE FOREIGN TABLE foreign_numbers(a int, b varchar(255)) SERVER pg_server OPTIONS (table_name 'numbers');
create or replace function test_pg_fdw() returns void as $$
DECLARE
n varchar;
BEGIN
FOR x IN 1..9 LOOP
select b into n from foreign_numbers where a=x;
raise notice 'Found number %', n;
end loop;
return;
END
$$ LANGUAGE plpgsql;
Step 3:
Run the test:
select test_pg_fdw();
select test_pg_fdw();
Step 4:
Check the output of auto_explain in the local server log
2017-01-31 00:39:25 PST LOG: duration: 8.388 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 1))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.315 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 2))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.250 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 3))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.257 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 4))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.271 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 5))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.251 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..114.91 rows=1 width=516)
Output: b
Filter: (foreign_numbers.a = $3)
Remote SQL: SELECT a, b FROM public.numbers
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.246 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..114.91 rows=1 width=516)
Output: b
Filter: (foreign_numbers.a = $3)
Remote SQL: SELECT a, b FROM public.numbers
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.226 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..114.91 rows=1 width=516)
Output: b
Filter: (foreign_numbers.a = $3)
Remote SQL: SELECT a, b FROM public.numbers
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.223 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..114.91 rows=1 width=516)
Output: b
Filter: (foreign_numbers.a = $3)
Remote SQL: SELECT a, b FROM public.numbers
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 13.963 ms plan:
Query Text: select test_pg_fdw();
Result (cost=0.00..0.26 rows=1 width=0)
Output: test_pg_fdw()
2017-01-31 00:39:25 PST LOG: duration: 8.388 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 1))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.315 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 2))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.250 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 3))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.257 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 4))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.271 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 width=516)
Output: b
Remote SQL: SELECT b FROM public.numbers WHERE ((a = 5))
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.251 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..114.91 rows=1 width=516)
Output: b
Filter: (foreign_numbers.a = $3)
Remote SQL: SELECT a, b FROM public.numbers
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.246 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..114.91 rows=1 width=516)
Output: b
Filter: (foreign_numbers.a = $3)
Remote SQL: SELECT a, b FROM public.numbers
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.226 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..114.91 rows=1 width=516)
Output: b
Filter: (foreign_numbers.a = $3)
Remote SQL: SELECT a, b FROM public.numbers
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 0.223 ms plan:
Query Text: select b from foreign_numbers where a=x
Foreign Scan on public.foreign_numbers (cost=100.00..114.91 rows=1 width=516)
Output: b
Filter: (foreign_numbers.a = $3)
Remote SQL: SELECT a, b FROM public.numbers
2017-01-31 00:39:25 PST CONTEXT: SQL statement "select b from foreign_numbers where a=x"
PL/pgSQL function test_pg_fdw() line 6 at SQL statement
2017-01-31 00:39:25 PST LOG: duration: 13.963 ms plan:
Query Text: select test_pg_fdw();
Result (cost=0.00..0.26 rows=1 width=0)
Output: test_pg_fdw()
Note that the remote query does not contain the WHERE clause after the 5th invocation.
I understand that this is because PostgreSQL starts using generic plan with pulled up where clause after the 5th invocation hoping that it would be faster since we have skiped planning the query on each invocation, but in this case this decision is causing the query to slow down.
How should we fix this problem?
--
Abbas
Follow us on Twitter
@EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more
Abbas
Architect
Skype ID: gabbasb
www.enterprisedb.comFollow us on Twitter
@EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more
On 2017/01/31 18:24, Abbas Butt wrote: > Postgres_fdw optimizes remote queries by pushing down the where clause. > This feature does not work consistently when the query is executed from > within a pl/pgsql function. The optimization works when the function > executes the query for the first 5 times, and fails afterwards. > I understand that this is because PostgreSQL starts using generic plan > with pulled up where clause after the 5th invocation hoping that it > would be faster since we have skiped planning the query on each > invocation, but in this case this decision is causing the query to slow > down. > > How should we fix this problem? ANALYZE for the foreign table doesn't work? Best regards, Etsuro Fujita
On Tue, Jan 31, 2017 at 2:25 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
No.
analyze ts.tickets;
WARNING: skipping "tickets" --- cannot analyze this foreign table
ANALYZE
On 2017/01/31 18:24, Abbas Butt wrote:Postgres_fdw optimizes remote queries by pushing down the where clause.
This feature does not work consistently when the query is executed from
within a pl/pgsql function. The optimization works when the function
executes the query for the first 5 times, and fails afterwards.I understand that this is because PostgreSQL starts using generic plan
with pulled up where clause after the 5th invocation hoping that it
would be faster since we have skiped planning the query on each
invocation, but in this case this decision is causing the query to slow
down.
How should we fix this problem?
ANALYZE for the foreign table doesn't work?
No.
analyze ts.tickets;
WARNING: skipping "tickets" --- cannot analyze this foreign table
ANALYZE
Best regards,
Etsuro Fujita
--
--
Abbas
Follow us on Twitter
@EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more
Abbas
Architect
Skype ID: gabbasb
www.enterprisedb.comFollow us on Twitter
@EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more
On 2017/01/31 19:53, Abbas Butt wrote: > On Tue, Jan 31, 2017 at 2:25 AM, Etsuro Fujita > <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote: > On 2017/01/31 18:24, Abbas Butt wrote: > Postgres_fdw optimizes remote queries by pushing down the where > clause. > This feature does not work consistently when the query is > executed from > within a pl/pgsql function. The optimization works when the function > executes the query for the first 5 times, and fails afterwards. > I understand that this is because PostgreSQL starts using > generic plan > with pulled up where clause after the 5th invocation hoping that it > would be faster since we have skiped planning the query on each > invocation, but in this case this decision is causing the query > to slow > down. > How should we fix this problem? > ANALYZE for the foreign table doesn't work? > No. > > analyze ts.tickets; > WARNING: skipping "tickets" --- cannot analyze this foreign table > ANALYZE How the foreign table ts.tickets is defined? Best regards, Etsuro Fujita
On Tue, Jan 31, 2017 at 3:15 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
test=# \d ts.tickets
Foreign table "ts.tickets"
Column | Type | Modifiers | FDW Options
--------+---------+-----------+-------------
id | integer | not null |
Server: mysql_server
FDW Options: (dbname 'msql_test_db', table_name 'tickets')
On 2017/01/31 19:53, Abbas Butt wrote:On Tue, Jan 31, 2017 at 2:25 AM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:
On 2017/01/31 18:24, Abbas Butt wrote:Postgres_fdw optimizes remote queries by pushing down the where
clause.
This feature does not work consistently when the query is
executed from
within a pl/pgsql function. The optimization works when the function
executes the query for the first 5 times, and fails afterwards.I understand that this is because PostgreSQL starts using
generic plan
with pulled up where clause after the 5th invocation hoping that it
would be faster since we have skiped planning the query on each
invocation, but in this case this decision is causing the query
to slow
down.How should we fix this problem?ANALYZE for the foreign table doesn't work?No.
analyze ts.tickets;
WARNING: skipping "tickets" --- cannot analyze this foreign table
ANALYZE
How the foreign table ts.tickets is defined?
test=# \d ts.tickets
Foreign table "ts.tickets"
Column | Type | Modifiers | FDW Options
--------+---------+-----------+-------------
id | integer | not null |
Server: mysql_server
FDW Options: (dbname 'msql_test_db', table_name 'tickets')
Its a foreign table, referring to table 'tickets' defined on MySQL.
Best regards,
Etsuro Fujita
--
--
Abbas
Follow us on Twitter
@EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more
Abbas
Architect
Skype ID: gabbasb
www.enterprisedb.comFollow us on Twitter
@EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more
On Tue, Jan 31, 2017 at 5:23 PM, Abbas Butt <abbas.butt@enterprisedb.com> wrote: > > > On Tue, Jan 31, 2017 at 3:15 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> > wrote: >> >> On 2017/01/31 19:53, Abbas Butt wrote: >>> >>> On Tue, Jan 31, 2017 at 2:25 AM, Etsuro Fujita >>> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote: >>> On 2017/01/31 18:24, Abbas Butt wrote: >> >> >>> Postgres_fdw optimizes remote queries by pushing down the where >>> clause. >>> This feature does not work consistently when the query is >>> executed from >>> within a pl/pgsql function. The optimization works when the >>> function >>> executes the query for the first 5 times, and fails afterwards. >> >> >>> I understand that this is because PostgreSQL starts using >>> generic plan >>> with pulled up where clause after the 5th invocation hoping that >>> it >>> would be faster since we have skiped planning the query on each >>> invocation, but in this case this decision is causing the query >>> to slow >>> down. >> >> >>> How should we fix this problem? >> >> >>> ANALYZE for the foreign table doesn't work? >> >> >>> No. >>> >>> analyze ts.tickets; >>> WARNING: skipping "tickets" --- cannot analyze this foreign table >>> ANALYZE >> >> >> How the foreign table ts.tickets is defined? > > > test=# \d ts.tickets > Foreign table "ts.tickets" > Column | Type | Modifiers | FDW Options > --------+---------+-----------+------------- > id | integer | not null | > Server: mysql_server > FDW Options: (dbname 'msql_test_db', table_name 'tickets') > > Its a foreign table, referring to table 'tickets' defined on MySQL. > Isn't your original complaint about postgres_fdw? You can not tickets, which is a mongo_fdw foreign table, is probably because mongo_fdw has not implemented analyze FDW routine. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Sorry for the confusion.
ANALYZE works for the foreign table 'foreign_numbers'.
test=# analyze foreign_numbers;ANALYZE
test=#
On Tue, Jan 31, 2017 at 5:04 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
Isn't your original complaint about postgres_fdw? You can not tickets,On Tue, Jan 31, 2017 at 5:23 PM, Abbas Butt <abbas.butt@enterprisedb.com> wrote:
>
>
> On Tue, Jan 31, 2017 at 3:15 AM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>
> wrote:
>>
>> On 2017/01/31 19:53, Abbas Butt wrote:
>>>
>>> On Tue, Jan 31, 2017 at 2:25 AM, Etsuro Fujita
>>> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:
>>> On 2017/01/31 18:24, Abbas Butt wrote:
>>
>>
>>> Postgres_fdw optimizes remote queries by pushing down the where
>>> clause.
>>> This feature does not work consistently when the query is
>>> executed from
>>> within a pl/pgsql function. The optimization works when the
>>> function
>>> executes the query for the first 5 times, and fails afterwards.
>>
>>
>>> I understand that this is because PostgreSQL starts using
>>> generic plan
>>> with pulled up where clause after the 5th invocation hoping that
>>> it
>>> would be faster since we have skiped planning the query on each
>>> invocation, but in this case this decision is causing the query
>>> to slow
>>> down.
>>
>>
>>> How should we fix this problem?
>>
>>
>>> ANALYZE for the foreign table doesn't work?
>>
>>
>>> No.
>>>
>>> analyze ts.tickets;
>>> WARNING: skipping "tickets" --- cannot analyze this foreign table
>>> ANALYZE
>>
>>
>> How the foreign table ts.tickets is defined?
>
>
> test=# \d ts.tickets
> Foreign table "ts.tickets"
> Column | Type | Modifiers | FDW Options
> --------+---------+-----------+-------------
> id | integer | not null |
> Server: mysql_server
> FDW Options: (dbname 'msql_test_db', table_name 'tickets')
>
> Its a foreign table, referring to table 'tickets' defined on MySQL.
>
which is a mongo_fdw foreign table, is probably because mongo_fdw has
not implemented analyze FDW routine.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
--
Abbas
Follow us on Twitter
@EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more
Abbas
Architect
Skype ID: gabbasb
www.enterprisedb.comFollow us on Twitter
@EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more
Abbas Butt <abbas.butt@enterprisedb.com> writes: > Postgres_fdw optimizes remote queries by pushing down the where clause. > This feature does not work consistently when the query is executed from > within a pl/pgsql function. The optimization works when the function > executes the query for the first 5 times, and fails afterwards. Well, it's switching to a generic plan. Your first five executions look like (ignoring the startup transient): > 2017-01-31 00:39:25 PST LOG: duration: 0.315 ms plan: > Query Text: select b from foreign_numbers where a=x > Foreign Scan on public.foreign_numbers (cost=100.00..111.91 rows=1 > width=516) > Output: b > Remote SQL: SELECT b FROM public.numbers WHERE ((a = 2)) with actual durations ranging from 0.250 to 0.315 ms. After that you get generic plans: > 2017-01-31 00:39:25 PST LOG: duration: 0.251 ms plan: > Query Text: select b from foreign_numbers where a=x > Foreign Scan on public.foreign_numbers (cost=100.00..114.91 rows=1 > width=516) > Output: b > Filter: (foreign_numbers.a = $3) > Remote SQL: SELECT a, b FROM public.numbers with actual durations ranging from 0.223 to 0.251 ms. So where's the slowdown, exactly? It looks to me like the planner has concluded that a custom plan is not enough better than generic to justify repeated planning cost, and it looks to me like it was right. > Note that the remote query does not contain the WHERE clause after the 5th > invocation. You'd have to take that up with the author of the FDW you're using. Optimization of queries involving foreign tables is almost completely on the head of the FDW, and this one seems not to know about pushing down WHERE clauses that involve Param nodes. regards, tom lane