-------------------------------------------------------------------------------------------------------------------------------------
Foreign table
CREATE FOREIGN TABLE user_info ( id bigint , info jsonb
)
SERVER server_test_fdw OPTIONS(SCHEMA_NAME 'public', TABLE_NAME 'user_info_raw' );
-- user_info_raw is a large table (100 million records, 200 GB)
Sample data of info
column
{"key1": 1, "key2": 0.678}
{"key1": 1, "key2": 1.0}
{"key1": 1, "key2": 0.986}
{"key1": 2, "key2": 0.75}
{"key1": 2, "key2": 0.639}
Query on foreign table (updated)
SELECT id, info
FROM user_info
WHERE info ->> 'key1'= '1' -- OR using jsonb_extract_path_text(info, 'key1') = '1'
ORDER BY id
LIMIT 10;
Limit (cost=10750829.63..10750829.65 rows=10 width=40) (actual time=550059.320..550059.326 rows=10 loops=1) -> Sort (cost=10750829.63..10751772.77 rows=377257 width=40) (actual time=550059.318..550059.321 rows=10 loops=1) Sort Key: id Sort Method: top-N heapsort Memory: 26kB -> Foreign Scan on user_info (cost=100.00..10742677.24 rows=377257 width=40) (actual time=1.413..536718.366 rows=68281020 loops=1) Filter: ((info ->> 'key1'::text) = '1'::text) Rows Removed by Filter: 7170443Planning time: 4.097 msExecution time: 550059.597 ms
Query on user_info_raw (updated)
EXPLAIN ANALYSE
SELECT id, info
FROM user_info_raw
WHERE info ->> 'key1'= '1'
ORDER BY id
LIMIT 10;
Limit (cost=0.57..1296.95 rows=10 width=59) (actual time=0.043..0.073 rows=10 loops=1) -> Index Scan using idx_user_info_raw_info on user_info_raw (cost=0.57..68882850.88 rows=531346 width=59) (actual time=0.042..0.070 rows=10 loops=1) Filter: ((info ->> 'key1'::text) = '1'::text)Planning time: 0.192 msExecution time: 0.102 ms
select pg_size_pretty(pg_table_size('user_info_raw'));pg_size_pretty
----------------223 GB
It takes 10 milliseconds to execute the query on the user_info_raw (remote server).
But, It takes a lot of time when using theuser_info
foreign table. When I remove ORDER BY id
, the query executes very fast.
I think that my query on foreign table should send to the remote server for executing, but it's not, I don't know why, may be due to of this excerpt from postgres_fdw document
By default, only WHERE clauses using built-in operators and functions will be considered for execution on the remote server. Clauses involving non-built-in functions are checked locally after rows are fetched. If such functions are available on the remote server and can be relied on to produce the same results as they do locally, performance can be improved by sending such WHERE clauses for remote execution