Thread: postgres_fdw aggregate pushdown for group by with expressions

postgres_fdw aggregate pushdown for group by with expressions

From
Michał Kłeczek
Date:
Hi,

I have the following foreign table:

CREATE FOREIGN TABLE t1 (
  grouping_column text,
  date_column date,
  whatever_data int
);

The query is:

SELECT
  sum(whatever_data)
FROM
  t1
GROUP BY
  grouping_colulmn, extract(YEAR FROM date_column), extract(MONTH FROM date_column);

From my (preliminary) testing postgres_fdw will not push down this aggregate query
- it will happily push down query with only “grouping_column” or “grouping_column, date_column" in GROUP BY

Is there a way to somehow push down the query with expressions in GROUP BY?

Thanks,
Michal


Re: postgres_fdw aggregate pushdown for group by with expressions

From
Michał Kłeczek
Date:

> On 3 Mar 2024, at 10:34, Michał Kłeczek <michal@kleczek.org> wrote:
>
> Hi,
>
> I have the following foreign table:
>
> CREATE FOREIGN TABLE t1 (
>  grouping_column text,
>  date_column date,
>  whatever_data int
> );
>
> The query is:
>
> SELECT
>  sum(whatever_data)
> FROM
>  t1
> GROUP BY
>  grouping_colulmn, extract(YEAR FROM date_column), extract(MONTH FROM date_column);
>
> From my (preliminary) testing postgres_fdw will not push down this aggregate query
> - it will happily push down query with only “grouping_column” or “grouping_column, date_column" in GROUP BY
>
> Is there a way to somehow push down the query with expressions in GROUP BY?

I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -
the WHERE criteria from the following query are not pushed down as well and filter is performed locally:

SELECT
  *
FROM
  t1
WHERE extract(YEAR FROM date_column) = 2023

I was under impression that “extract” meets all criteria for pushable expressions - looks like I am wrong?

What would be the expression to achieve the same but such that postgres_fdw would push it down?

Thanks,
Michal




Re: postgres_fdw aggregate pushdown for group by with expressions

From
Adrian Klaver
Date:
On 3/3/24 05:16, Michał Kłeczek wrote:
> 
> 
>> On 3 Mar 2024, at 10:34, Michał Kłeczek <michal@kleczek.org> wrote:
>>
>> Hi,
>>
>> I have the following foreign table:
>>
>> CREATE FOREIGN TABLE t1 (
>>   grouping_column text,
>>   date_column date,
>>   whatever_data int
>> );
>>
>> The query is:
>>
>> SELECT
>>   sum(whatever_data)
>> FROM
>>   t1
>> GROUP BY
>>   grouping_colulmn, extract(YEAR FROM date_column), extract(MONTH FROM date_column);
>>
>>  From my (preliminary) testing postgres_fdw will not push down this aggregate query
>> - it will happily push down query with only “grouping_column” or “grouping_column, date_column" in GROUP BY
>>
>> Is there a way to somehow push down the query with expressions in GROUP BY?
> 
> I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -
> the WHERE criteria from the following query are not pushed down as well and filter is performed locally:
> 
> SELECT
>    *
> FROM
>    t1
> WHERE extract(YEAR FROM date_column) = 2023
> 
> I was under impression that “extract” meets all criteria for pushable expressions - looks like I am wrong?
> 
> What would be the expression to achieve the same but such that postgres_fdw would push it down?

What Postgres versions on both ends?

What does

select proname, prosrc,  provolatile from pg_proc where proname = 'extract';

return?

> 
> Thanks,
> Michal
> 
> 
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: postgres_fdw aggregate pushdown for group by with expressions

From
Michał Kłeczek
Date:
Thanks for help.

I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -
the WHERE criteria from the following query are not pushed down as well and filter is performed locally:
SELECT
  *
FROM
  t1
WHERE extract(YEAR FROM date_column) = 2023
I was under impression that “extract” meets all criteria for pushable expressions - looks like I am wrong?
What would be the expression to achieve the same but such that postgres_fdw would push it down?

What Postgres versions on both ends?

16.1 on both sides


What does

select proname, prosrc,  provolatile from pg_proc where proname = 'extract';

return?

extract | extract_date | i

(and bunch of similar records for other types - for timestampz provolatile is “s”)

Michal
=?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:
> I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -

Yeah :-(.  I traced through this, and it seems it's a collation
problem.  Internally, that call looks like
    extract('year'::text, date_column)
The text constant is marked as having collation "default", which means
that extract() is marked as having input collation "default", and then
it falls foul of this rule:

 * ... An expression is considered safe to send
 * only if all operator/function input collations used in it are traceable to
 * Var(s) of the foreign table.  That implies that if the remote server gets
 * a different answer than we do, the foreign table's columns are not marked
 * with collations that match the remote table's columns, which we can
 * consider to be user error.

Of course, extract() doesn't actually care about collation, but
postgres_fdw has no good way to know that.  Nor does it trust the
remote server to have the same set of collations the local one does,
so it doesn't want to try to fix this by sending explicit COLLATE
clauses.

Somebody ought to work on improving that mess sometime.  One thought
that comes to mind is to have a server option authorizing postgres_fdw
to believe that all local collations exist on the remote side.

            regards, tom lane



Re: postgres_fdw aggregate pushdown for group by with expressions

From
Michał Kłeczek
Date:
Ok, so that means the workaround is to create an extension with year(date) and month(date) functions that internally
wouldcall extract. 

Not ideal but workable I guess.

—
Michał

> On 3 Mar 2024, at 18:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> =?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:
>> I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -
>
> Yeah :-(.  I traced through this, and it seems it's a collation
> problem.  Internally, that call looks like
>    extract('year'::text, date_column)
> The text constant is marked as having collation "default", which means
> that extract() is marked as having input collation "default", and then
> it falls foul of this rule:
>
> * ... An expression is considered safe to send
> * only if all operator/function input collations used in it are traceable to
> * Var(s) of the foreign table.  That implies that if the remote server gets
> * a different answer than we do, the foreign table's columns are not marked
> * with collations that match the remote table's columns, which we can
> * consider to be user error.
>
> Of course, extract() doesn't actually care about collation, but
> postgres_fdw has no good way to know that.  Nor does it trust the
> remote server to have the same set of collations the local one does,
> so it doesn't want to try to fix this by sending explicit COLLATE
> clauses.
>
> Somebody ought to work on improving that mess sometime.  One thought
> that comes to mind is to have a server option authorizing postgres_fdw
> to believe that all local collations exist on the remote side.
>
>            regards, tom lane



Re: postgres_fdw aggregate pushdown for group by with expressions

From
Michał Kłeczek
Date:

> On 3 Mar 2024, at 18:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> =?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:
>> I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -
>
> Yeah :-(.  I traced through this, and it seems it's a collation
> problem.  Internally, that call looks like
>     extract('year'::text, date_column)
> The text constant is marked as having collation "default", which means
> that extract() is marked as having input collation "default", and then
> it falls foul of this rule:
[snip]

Thanks for explanation - I have a follow-up question.

Does that mean that *any* expression containing text constants is not going to be pushed down??
That would be a really serious issues I’d say.

—
Michal


Re: postgres_fdw aggregate pushdown for group by with expressions

From
Michał Kłeczek
Date:

> On 4 Mar 2024, at 07:46, Michał Kłeczek <michal@kleczek.org> wrote:
>
>
>
>> On 3 Mar 2024, at 18:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> =?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:
>>> I’ve performed some more tests and it seems expressions with “extract” function are not pushed down at all -
>>
>> Yeah :-(.  I traced through this, and it seems it's a collation
>> problem.  Internally, that call looks like
>>     extract('year'::text, date_column)
>> The text constant is marked as having collation "default", which means
>> that extract() is marked as having input collation "default", and then
>> it falls foul of this rule:
> [snip]
>
> Thanks for explanation - I have a follow-up question.
>
> Does that mean that *any* expression containing text constants is not going to be pushed down??
> That would be a really serious issues I’d say.

I’m afraid it is affecting all expressions containing functions.
This is really troublesome as it blocks push down for some basic (and collation independent) functions like coalesce.

—
Michal


Re: postgres_fdw aggregate pushdown for group by with expressions

From
Michał Kłeczek
Date:
Hi All,

> On 3 Mar 2024, at 10:34, Michał Kłeczek <michal@kleczek.org> wrote:
>
> Hi,
>
> I have the following foreign table:
>
> CREATE FOREIGN TABLE t1 (
>  grouping_column text,
>  date_column date,
>  whatever_data int
> );
>
> The query is:
>
> SELECT
>  sum(whatever_data)
> FROM
>  t1
> GROUP BY
>  grouping_colulmn, extract(YEAR FROM date_column), extract(MONTH FROM date_column);
>
> From my (preliminary) testing postgres_fdw will not push down this aggregate query
> - it will happily push down query with only “grouping_column” or “grouping_column, date_column" in GROUP BY

I was able to make it work by creating an extension and installing it on both sides with functions:

year(date) IMMUTABLE
month(date) IMMUTABLE

But it looks there are more issues - the following queries work fine (ie. are pushed down to remote server):

SELECT * FROM t1 WHERE grouping_column LIKE ‘abcd%’;
and
SELECT sum(whatever_data) FROM t1 GROUP BY grouping_column, year(date_column), month(date_column)

But together - unfortunately not:

SELECT grouping_column, sum(whatever_data) FROM t1 WHERE grouping_column LIKE ‘abcd%' GROUP BY grouping_column,
year(date_column)

In this case aggregate is performed locally.
Not sure if this is because of statistics (but I set fdw_tuple_cost to a very high value 99999 to try to force planner
topush down aggregates) 

—
Michal


=?utf-8?Q?Micha=C5=82_K=C5=82eczek?= <michal@kleczek.org> writes:
> Does that mean that *any* expression containing text constants is not going to be pushed down??
> That would be a really serious issues I’d say.

If it contains text constants and not text variables, possibly so.
Keep in mind though that this rule is applied after constant-folding,
so that what we're left with is normally going to contain variables
at every level.  extract() is problematic because it combines a
text constant with a datetime (collation-less) variable.

            regards, tom lane