Thread: BUG #15287: postgres_fdw: the "WHERE date_trunc('day',dt) = 'YYYY-MM-DD' does not push to remote.
BUG #15287: postgres_fdw: the "WHERE date_trunc('day',dt) = 'YYYY-MM-DD' does not push to remote.
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 15287 Logged by: Ireneusz Pluta Email address: ipluta@wp.pl PostgreSQL version: 10.3 Operating system: FreeBSD 11.1 Description: the clause: WHERE date_trunc('day'::text, dt::timestamp without time zone) = 'yyyy-mm-dd' does not get shipped to the foreign server. as seen on #postgresql: [18:13] <RhodiumToad> date_trunc takes a mix of collatable and non-collatable parameters (text is collatable, timestamp is not) [18:13] <RhodiumToad> this is why I wanted the debug version of the query tree, to see the collation ids [18:14] <RhodiumToad> so given date_trunc('day', col), what happens is that the walker first recurses into the args: [18:14] <RhodiumToad> 'day' is a Const node with default collation, so the inner_cxt.state gets set to NONE [18:15] <RhodiumToad> "col" is a remote Var of a non-collatable type, so inner_cxt.state still gets set to NONE [18:15] <RhodiumToad> but then in the T_FuncExpr case in the walker, we get to this line (468 in master): [18:16] <RhodiumToad> if (fe->inputcollid == InvalidOid) ; else if (inner_cxt.state != FDW_COLLATE_SAFE || ...) return false; [18:17] <RhodiumToad> fe->inputcollid is DEFAULT_COLLATION_OID, not InvalidOid, and inner_cxt.state is NONE, not SAFE, so the walker bails out at that point [18:17] <RhodiumToad> and reports the expression as "not safe for remote" [18:18] <irqq_> should it be reported as a bug and expected to be corrected? [18:18] <RhodiumToad> basically this is confusion over the difference between collid=InvalidOid, meaning "not of a collatable type", and =DEFAULT_COLLATION_OID meaning "of a collatable type but no specified collation" [18:18] <RhodiumToad> yes it should be reported as a bug
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) ='YYYY-MM-DD' does not push to remote.
From
Andres Freund
Date:
Hi, On 2018-07-20 16:31:18 +0000, PG Bug reporting form wrote: > the clause: > WHERE date_trunc('day'::text, dt::timestamp without time zone) = > 'yyyy-mm-dd' > does not get shipped to the foreign server. You're probably going to have a higher likelihood of getting the bug fixed quickly if you'd include a full reproducer. Greetings, Andres Freund
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) ='YYYY-MM-DD' does not push to remote.
From
Ireneusz Pluta
Date:
W dniu 2018-07-20 o 18:39, Andres Freund pisze: > Hi, > > On 2018-07-20 16:31:18 +0000, PG Bug reporting form wrote: >> the clause: >> WHERE date_trunc('day'::text, dt::timestamp without time zone) = >> 'yyyy-mm-dd' >> does not get shipped to the foreign server. > You're probably going to have a higher likelihood of getting the bug > fixed quickly if you'd include a full reproducer. > > Greetings, > > Andres Freund Sure, but RhodiumToad was so kind promising to follow up on this, so I skipped that chore :-).
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) = 'YYYY-MM-DD' does not push to remote.
From
Andrew Gierth
Date:
>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes: PG> the clause: PG> WHERE date_trunc('day'::text, dt::timestamp without time zone) = 'yyyy-mm-dd' PG> does not get shipped to the foreign server. Just to expand and clarify my response from irc: This is the relevant part of the parse tree (as shown by debug_print_parse) annotated for convenience: {FUNCEXPR :funcid 2020 = date_trunc(text,timestamp without time zone) :funcresulttype 1114 = timestamp without time zone :funcretset false :funcvariadic false :funcformat 0 :funccollid 0 :inputcollid 100 = "default" :args ( {CONST :consttype 25 :consttypmod -1 :constcollid 100 = "default" :constlen -1 :constbyval false :constisnull false :location 55 :constvalue 7 [ 28 0 0 0 100 97 121 ] } {VAR -- the foreign table is the only table in the query, :varno 1 -- so this var is remote :varattno 3 :vartype 1114 = timestamp without time zone :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 3 :location 68 } ) :location 44 } foreign_expr_walker's FuncExpr case first walks the args, which gives an inner_cxt.state == FDW_COLLATE_NONE since the Const's collid of "default" is explicitly treated the same as InvalidOid (but note that this is NOT what assign_collations_walker did: it assigned the function's inputcollid as 100 (default), not InvalidOid). Then we get here: /* * If function's input collation is not derived from a foreign * Var, it can't be sent to remote. */ if (fe->inputcollid == InvalidOid) /* OK, inputs are all noncollatable */ ; else if (inner_cxt.state != FDW_COLLATE_SAFE || fe->inputcollid != inner_cxt.collation) return false; so with inputcollid==100 and inner_cxt.state == FDW_COLLATE_NONE, the walker bails out at this point. -- Andrew (irc:RhodiumToad)
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) = 'YYYY-MM-DD' does not push to remote.
From
Tom Lane
Date:
Ireneusz Pluta <ipluta@wp.pl> writes: > W dniu 2018-07-20 o 18:39, Andres Freund pisze: >> You're probably going to have a higher likelihood of getting the bug >> fixed quickly if you'd include a full reproducer. > Sure, but RhodiumToad was so kind promising to follow up on this, so I skipped that chore :-). It may be a bug, but I'm afraid it's a can't-fix one, at least for the foreseeable future. postgres_fdw is correctly determining that the date_trunc function's input collation is being chosen on the basis of a defaultly-collated literal, and that makes it unsafe to ship. The situation is not different from the case of, say, 'foo'::text > 'bar'::text (ignoring the likelihood that that'd get const-folded before it got here). If that expression is evaluated locally, the result will depend on the local database's default collation; whereas if we send it for remote execution, the result will depend on the remote database's default collation, and could very well be different. In reality, date_trunc() pays no attention to its input collation; but postgres_fdw has no way to know that, so it can't safely ship this expression. If we had a way to know that the remote database's default collation acts the same as the local one's, we could use different and more forgiving rules about what can be shipped; but I'm not sure how we could know that reliably. (We could perhaps check that it was named the same, but in cross-platform situations that proves little.) Alternatively, if postgres_fdw could know which functions don't really pay attention to their input collation, it could skip this check for those functions. But that's not an easy thing to fix either. regards, tom lane
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) = 'YYYY-MM-DD' does not push to remote.
From
Andrew Gierth
Date:
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: >> the clause: >> WHERE date_trunc('day'::text, dt::timestamp without time zone) = >> 'yyyy-mm-dd' >> does not get shipped to the foreign server. Andres> You're probably going to have a higher likelihood of getting Andres> the bug fixed quickly if you'd include a full reproducer. On remote: create table foreigntab (dt timestamp); insert into foreigntab select timestamp '2000-01-01' + (random()*568036800) * interval '1 second' from generate_series(1,100000); create index on foreigntab (date_trunc('day',dt)); analyze foreigntab; On local: -- create foreigntab as a foreign table or import the foreign schema -- e.g. create server foreigndb foreign data wrapper postgres_fdw options (dbname 'foreigndb'); create user mapping for postgres server foreigndb; import foreign schema public from server foreigndb into public; postgres=# explain analyze select * from foreigntab where date_trunc('day', dt) = '2018-07-20'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------- Foreign Scan on foreigntab (cost=100.00..199.60 rows=13 width=8) (actual time=637.399..637.399 rows=0 loops=1) Filter: (date_trunc('day'::text, dt) = '2018-07-20 00:00:00'::timestamp without time zone) Rows Removed by Filter: 100000 -- Andrew (irc:RhodiumToad)
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) = 'YYYY-MM-DD' does not push to remote.
From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> Alternatively, if postgres_fdw could know which functions don't Tom> really pay attention to their input collation, it could skip this Tom> check for those functions. But that's not an easy thing to fix Tom> either. Wild idea: we have quite a few functions (e.g. date_trunc, date_part, timezone, encode, decode, make_timestamptz, plus various extension functions such as in pgcrypto) that take a "text" parameter which is nothing more than a poor-man's enum, or otherwise represents something which isn't actually free text. Perhaps these should use "name" instead, or in some cases an enum type? (or in the case of date_trunc and extract, maybe they need to be split up into separate functions so that we can properly distinguish the immutable from the mutable cases) -- Andrew (irc:RhodiumToad)
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) ='YYYY-MM-DD' does not push to remote.
From
Alvaro Herrera
Date:
On 2018-Jul-20, Andrew Gierth wrote: > (or in the case of date_trunc and extract, maybe they need to be split > up into separate functions so that we can properly distinguish the > immutable from the mutable cases) I think for partition pruning this would be useful too. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) = 'YYYY-MM-DD' does not push to remote.
From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Wild idea: we have quite a few functions (e.g. date_trunc, date_part, > timezone, encode, decode, make_timestamptz, plus various extension > functions such as in pgcrypto) that take a "text" parameter which is > nothing more than a poor-man's enum, or otherwise represents something > which isn't actually free text. Perhaps these should use "name" instead, > or in some cases an enum type? Hm. It'd definitely be cool if constant text parameters could be const-folded to some enum-like representation, so that at runtime these functions would just do integer comparisons rather than string comparisons to decide what to do. Not sure about the backwards compatibility considerations though. However, that could only fix the OP's problem for these particular functions, which are a small fraction of the ones we have that take collatable types but don't really care about collation. I kinda feel like we should have invented an "ignores collation" function property, so that such functions could be identified --- that would not only let postgres_fdw handle this honestly, but we could refrain from throwing unmatched-collations errors for cases where it doesn't really matter. Maybe it's not too late to invent that in future. If the parser sets inputcollid = 0 for any function marked that way, then if the function is incorrectly marked as ignoring collation it would get an error at runtime, so it seems like we could add it safely. > (or in the case of date_trunc and extract, maybe they need to be split > up into separate functions so that we can properly distinguish the > immutable from the mutable cases) Different issue ... and not actually relevant to this case, since this form of date_trunc is immutable. regards, tom lane
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) ='YYYY-MM-DD' does not push to remote.
From
Jeff Janes
Date:
On Fri, Jul 20, 2018 at 2:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
However, that could only fix the OP's problem for these particular
functions, which are a small fraction of the ones we have that take
collatable types but don't really care about collation. I kinda feel
like we should have invented an "ignores collation" function property,
so that such functions could be identified --- that would not only let
postgres_fdw handle this honestly, but we could refrain from throwing
unmatched-collations errors for cases where it doesn't really matter.
Maybe it's not too late to invent that in future. If the parser sets
inputcollid = 0 for any function marked that way, then if the function is
incorrectly marked as ignoring collation it would get an error at runtime,
so it seems like we could add it safely.
Would this automatically apply to operators as well, like hstore's -> operator, once the fetchval function was marked? The non-shippability of that can be a major problem.
Cheers,
Jeff
Re: BUG #15287: postgres_fdw: the "WHERE date_trunc('day', dt) = 'YYYY-MM-DD' does not push to remote.
From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes: > On Fri, Jul 20, 2018 at 2:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> However, that could only fix the OP's problem for these particular >> functions, which are a small fraction of the ones we have that take >> collatable types but don't really care about collation. I kinda feel >> like we should have invented an "ignores collation" function property, >> so that such functions could be identified --- that would not only let >> postgres_fdw handle this honestly, but we could refrain from throwing >> unmatched-collations errors for cases where it doesn't really matter. > Would this automatically apply to operators as well, like hstore's -> > operator, once the fetchval function was marked? Sure, just like a function's volatility (for example) applies to operators based on it. It'd require a bit of additional code to make that happen, but I'd expect that to be included in any patch for this purpose. regards, tom lane