Re: Partitioning and postgres_fdw optimisations for multi-tenancy - Mailing list pgsql-hackers

From Alexey Kondratov
Subject Re: Partitioning and postgres_fdw optimisations for multi-tenancy
Date
Msg-id 55860ece837dd9cb7661bac46ee727af@postgrespro.ru
Whole thread Raw
In response to Re: Partitioning and postgres_fdw optimisations for multi-tenancy  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: Partitioning and postgres_fdw optimisations for multi-tenancy
List pgsql-hackers
On 2020-07-14 15:27, Ashutosh Bapat wrote:
> On Tue, Jul 14, 2020 at 12:48 AM Alexey Kondratov
> <a.kondratov@postgrespro.ru> wrote:
>> I built a simple two node multi-tenant schema for tests, which can be
>> easily set up with attached scripts. It creates three tables 
>> (companies,
>> users, documents) distributed over two nodes. Everything can be found 
>> in
>> this Gist [2] as well.
>> 
>> Some real-life test queries show, that all single-node queries aren't
>> pushed-down to the required node. For example:
>> 
>> SELECT
>>      *
>> FROM
>>      documents
>>      INNER JOIN users ON documents.user_id = users.id
>> WHERE
>>      documents.company_id = 5
>>      AND users.company_id = 5;
> 
> There are a couple of things happening here
> 1. the clauses on company_id in WHERE clause are causing partition
> pruning. Partition-wise join is disabled with partition pruning before
> PG13. In PG13 we have added advanced partition matching algorithm
> which will allow partition-wise join with partition pruning.
> 

I forgot to mention that I use a recent master (991c444e7a) for tests 
with

enable_partitionwise_join = 'on'
enable_partitionwise_aggregate = 'on'

of course. I've also tried postgres_fdw.use_remote_estimate = true 
followed by ANALYSE on both nodes (it is still used in setup.sh script).

BTW, can you, please, share a link to commit / thread about allowing 
partition-wise join and partition pruning to work together in PG13?

> 
> 2. the query has no equality condition on the partition key of the
> tables being joined. Partitionwise join is possible only when there's
> an equality condition on the partition keys (company_id) of the
> joining tables. PostgreSQL's optimizer is not smart enough to convert
> the equality conditions in WHERE clause into equality conditions on
> partition keys. So having those conditions just in WHERE clause does
> not help. Instead please add equality conditions on partition keys in
> JOIN .. ON clause or WHERE clause (only for INNER join).
> 

With adding documents.company_id = users.company_id

SELECT *
FROM
     documents
     INNER JOIN users ON (documents.company_id = users.company_id
             AND documents.user_id = users.id)
WHERE
     documents.company_id = 5
     AND users.company_id = 5;

query plan remains the same.

>> 
>> executed as following
>> 
>>                        QUERY PLAN
>> -------------------------------------------------------
>>   Nested Loop
>>     Join Filter: (documents.user_id = users.id)
>>     ->  Foreign Scan on users_node2 users
>>     ->  Materialize
>>           ->  Foreign Scan on documents_node2 documents
>> 
>> i.e. it uses two foreign scans and does the final join locally. 
>> However,
>> once I specify target partitions explicitly, then the entire query is
>> pushed down to the foreign node:
>> 
>>                         QUERY PLAN
>> ---------------------------------------------------------
>>   Foreign Scan
>>     Relations: (documents_node2) INNER JOIN (users_node2)
>> 
>> Execution time is dropped significantly as well — by more than 3 times
>> even for this small test database. Situation for simple queries with
>> aggregates or joins and aggregates followed by the sharding key filter
>> is the same. Something similar was briefly discussed in this thread 
>> [3].
>> 
>> IIUC, it means that push-down of queries through the postgres_fdw 
>> works
>> perfectly well, the problem is with partition-wise operation detection
>> at the planning time. Currently, partition-wise aggregate routines,
>> e.g., looks for a GROUP BY and checks whether sharding key exists 
>> there
>> or not. After that PARTITIONWISE_AGGREGATE_* flag is set. However, it
>> doesn't look for a content of WHERE clause, so frankly speaking it 
>> isn't
>> a problem, this functionality is not yet implemented.
>> 
>> Actually, sometimes I was able to push down queries with aggregate
>> simply by adding an additional GROUP BY with sharding key, like this:
>> 
>> SELECT
>>      count(*)
>> FROM
>>      documents
>> WHERE
>>      company_id = 5
>> GROUP BY company_id;
> 
> This gets pushed down since GROUP BY clause is on the partition key.
> 

Sure, but it only works *sometimes*, I've never seen most of such simple 
queries with aggregates to be pushed down, e.g.:

SELECT
     sum(id)
FROM
     documents_node2
WHERE
     company_id = 5
GROUP BY
     company_id;

whether 'GROUP BY company_id' is used or not.

>> 
>> Although it seems that it will be easier to start with aggregates,
>> probably we should initially plan a more general solution? For 
>> example,
>> check that all involved tables are filtered by partitioning key and 
>> push
>> down the entire query if all of them target the same foreign server.
>> 
>> Any thoughts?
> 
> I think adding just equality conditions on the partition key will be
> enough. No need for any code change.

So, it hasn't helped. Maybe I could modify some costs to verify that 
push-down of such joins is ever possible?

Anyway, what about aggregates? Partition-wise aggregates work fine for 
queries like

SELECT
     count(*)
FROM
     documents
GROUP BY
     company_id;

but once I narrow it to a single partition with 'WHERE company_id = 5', 
then it is being executed in a very inefficient way — takes all rows 
from remote partition / node and performs aggregate locally. It doesn't 
seem like a problem with query itself.

In my experience, both partition-wise joins and aggregates work well 
with simple GROUP or JOIN by the partitioning key, which corresponds to 
massive multi-partition OLAP queries. However, both stop working for a 
single-partition queries with WHERE, when postgres_fdw and partitioning 
are used. I'd be glad if you share any new guesses of how to make them 
working without code modification.


Thanks
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."
Next
From: "David G. Johnston"
Date:
Subject: Re: DROP relation IF EXISTS Docs and Tests - Bug Fix