Re: Indices and Foreign Tables - Mailing list pgsql-general

From Torsten Förtsch
Subject Re: Indices and Foreign Tables
Date
Msg-id 533F0E0B.2090105@gmx.net
In response to Re: Indices and Foreign Tables  ("Kohler Manuel (ID SIS)" <manuel.kohler@id.ethz.ch>)
List pgsql-general
On 04/04/14 17:14, Kohler Manuel (ID SIS) wrote:
> Here is the same query with fdw:
>
> db=# EXPLAIN VERBOSE select ds.code, count(*), sum(dsf.size_in_bytes) as "raw_size", pg_size_pretty(sum(dsf.size_in_bytes)) as "size" from data_set_files_fdw dsf, data_sets_fdw ds where dsf.parent_id is null and dsf.dase_id=ds.id group by ds.code order by raw_size desc;
>                                                   QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------
>  Sort  (cost=327.81..327.97 rows=64 width=40)
>    Output: ds.code, (count(*)), (sum(dsf.size_in_bytes)), (pg_size_pretty(sum(dsf.size_in_bytes)))
>    Sort Key: (sum(dsf.size_in_bytes))
>    ->  HashAggregate  (cost=325.09..325.89 rows=64 width=40)
>          Output: ds.code, count(*), sum(dsf.size_in_bytes), pg_size_pretty(sum(dsf.size_in_bytes))
>          ->  Hash Join  (cost=270.61..324.45 rows=64 width=40)
>                Output: dsf.size_in_bytes, ds.code
>                Hash Cond: (ds.id = (dsf.dase_id)::bigint)
>                ->  Foreign Scan on public.data_sets_fdw ds  (cost=100.00..148.40 rows=1280 width=40)
>                      Output: ds.id, ds.code, ds.location
>                      Remote SQL: SELECT id, code FROM public.data_sets
>                ->  Hash  (cost=170.48..170.48 rows=10 width=16)
>                      Output: dsf.size_in_bytes, dsf.dase_id
>                      ->  Foreign Scan on public.data_set_files_fdw dsf  (cost=100.00..170.48 rows=10 width=16)
>                            Output: dsf.size_in_bytes, dsf.dase_id
>                            Filter: (dsf.parent_id IS NULL)
>                            Remote SQL: SELECT dase_id, parent_id, size_in_bytes FROM public.data_set_files
> (17 rows)
>
> And also with ANALYZE:
>
> db=# EXPLAIN ANALYZE select ds.code, count(*), sum(dsf.size_in_bytes) as "raw_size", pg_size_pretty(sum(dsf.size_in_bytes)) as "size" from data_set_files_fdw dsf, data_sets_fdw ds where dsf.parent_id is null and dsf.dase_id=ds.id group by ds.code order by raw_size desc;
>                                                                           QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=327.81..327.97 rows=64 width=40) (actual time=219401.864..219414.641 rows=13839 loops=1)
>    Sort Key: (sum(dsf.size_in_bytes))
>    Sort Method: quicksort  Memory: 2283kB
>    ->  HashAggregate  (cost=325.09..325.89 rows=64 width=40) (actual time=219327.664..219363.709 rows=13839 loops=1)
>          ->  Hash Join  (cost=270.61..324.45 rows=64 width=40) (actual time=219127.848..219277.308 rows=13839 loops=1)
>                Hash Cond: (ds.id = (dsf.dase_id)::bigint)
>                ->  Foreign Scan on data_sets_fdw ds  (cost=100.00..148.40 rows=1280 width=40) (actual time=1.057..77.415 rows=13839 loops=1)
>                ->  Hash  (cost=170.48..170.48 rows=10 width=16) (actual time=219126.713..219126.713 rows=13839 loops=1)
>                      Buckets: 1024  Batches: 1  Memory Usage: 649kB
>                      ->  Foreign Scan on data_set_files_fdw dsf  (cost=100.00..170.48 rows=10 width=16) (actual time=1082.614..219083.326 rows=13839 loops=1)
>                            Filter: (parent_id IS NULL)
>                            Rows Removed by Filter: 35726596
>  Total runtime: 219438.925 ms
> (13 rows)

Do you know that you can combine VERBOSE and ANALYZE in one EXPLAIN?

  EXPLAIN (ANALYZE, VERBOSE) SELECT ...

The problem is that your qualifier is not pushed down to the backend
database. So you pull 35726596+13839 rows (that is, the complete
table) from data_set_files in the remote database, filter them locally,
and throw 35726596 of them away. Then you copy the remote data_sets
table (13839 rows). And only then do you join them.

I don't think use_remote_estimate can help a lot in this situation.
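For reference, use_remote_estimate is set per server or per foreign table
(the server name here is illustrative); it only improves the planner's row
estimates, it does not make the FDW push the qualifier down:

    ALTER SERVER remote_srv
      OPTIONS (ADD use_remote_estimate 'true');

    -- or just for one foreign table
    ALTER FOREIGN TABLE data_set_files_fdw
      OPTIONS (ADD use_remote_estimate 'true');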

We tried to use postgres_fdw in a similar way last year, then switched
to dblink because, although it is possible for the FDW to push
qualifiers down to the backend, it too often does not do so. It then
copies large tables over the network and scans them sequentially on the
local machine. Also, LIMIT, for instance, is never pushed down to the
backend.
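You can verify this yourself: the Remote SQL line of EXPLAIN VERBOSE
shows exactly what is sent to the backend, and no LIMIT appears in it
(comment is illustrative, output abbreviated):

    EXPLAIN VERBOSE SELECT * FROM data_set_files_fdw LIMIT 10;
    -- the Remote SQL line contains no LIMIT; the whole table is
    -- fetched and the limit is applied locally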

Now, we create functions like this:

CREATE OR REPLACE FUNCTION __get_client_metrics_on(srv TEXT)
RETURNS TABLE(loginid TEXT,
              ...)
AS $def$

    SELECT tb.*
      FROM dblink($1, $$

        -- remote query starts here

            SELECT c.loginid, ...
              FROM clients c
              JOIN ...

        -- remote query ends here

      $$) AS tb(loginid TEXT,
                ...)

$def$ LANGUAGE sql VOLATILE SECURITY DEFINER ROWS ... COST ...;

CREATE OR REPLACE FUNCTION get_client_metrics()
RETURNS TABLE(srv TEXT,
              loginid TEXT,
              ...)
AS $def$

    SELECT s.srvname, rem.*
      FROM production_servers() s
     CROSS JOIN __get_client_metrics_on(s.srvname) rem

$def$ LANGUAGE sql STABLE SECURITY invoker;

The production_servers function is defined more or less like this:

CREATE OR REPLACE FUNCTION production_servers()
RETURNS TABLE(srvname TEXT) AS $def$

  SELECT s.srvname::TEXT
    FROM pg_catalog.pg_foreign_data_wrapper w
    JOIN pg_catalog.pg_foreign_server s ON (s.srvfdw=w.oid)
   WHERE w.fdwname IN ('postgres_fdw', 'dblink_fdw')

$def$ LANGUAGE sql STABLE;


Then you can

  SELECT ... FROM get_client_metrics() JOIN ... WHERE ...


It is much more work than simply importing tables from the backend as
foreign tables. But for us it worked very well, and in many cases we
achieved much better performance than before. This new approach replaces
a system where all the production databases poured their data into one
large database via Bucardo, only so that we could run analyses on the
complete data set.

And there is more you can achieve with dblink. For instance, you can
execute the remote queries concurrently (you don't even need PL/pgSQL
for that; pure SQL is sufficient).
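A minimal sketch of that (connection names, server names and the query
are illustrative): dblink_send_query() submits a query without waiting,
and dblink_get_result() collects the rows later, so both servers work
at the same time:

    -- open one named connection per server
    SELECT dblink_connect('c1', 'srv_a');
    SELECT dblink_connect('c2', 'srv_b');

    -- fire both queries without waiting (returns 1 on success)
    SELECT dblink_send_query('c1', 'SELECT loginid FROM clients');
    SELECT dblink_send_query('c2', 'SELECT loginid FROM clients');

    -- both servers are now computing concurrently; fetch the results
    SELECT * FROM dblink_get_result('c1') AS t(loginid TEXT)
    UNION ALL
    SELECT * FROM dblink_get_result('c2') AS t(loginid TEXT);

    SELECT dblink_disconnect('c1'), dblink_disconnect('c2');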

Torsten

