Thread: MERGE and parsing with prepared statements

MERGE and parsing with prepared statements

From
Justin Pryzby
Date:
We've used INSERT ON CONFLICT for a few years (with partitions as the target).
That's also combined with prepared statements, for bulk loading.

I was looking to see if we should use MERGE (probably not, but looking anyway).
And came across this behavior.  I'm not sure if it's any issue.

CREATE TABLE CustomerAccount (CustomerId int, Balance float);

PREPARE p AS
MERGE INTO CustomerAccount CA
USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId
WHEN NOT MATCHED THEN
  INSERT (CustomerId, Balance)
  VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
  UPDATE SET Balance = Balance + TransactionValue;

ERROR:  operator does not exist: integer = text
LINE 3: ON CA.CustomerId = T.CustomerId

        postgres: pryzbyj postgres [local] PREPARE(+0x2337be) [0x56108322e7be]
        postgres: pryzbyj postgres [local] PREPARE(oper+0x198) [0x56108322f1fb]
        postgres: pryzbyj postgres [local] PREPARE(make_op+0x7e) [0x56108322f55a]
        postgres: pryzbyj postgres [local] PREPARE(+0x228f2b) [0x561083223f2b]
        postgres: pryzbyj postgres [local] PREPARE(+0x227aa9) [0x561083222aa9]
        postgres: pryzbyj postgres [local] PREPARE(transformExpr+0x1c) [0x5610832227f9]
        postgres: pryzbyj postgres [local] PREPARE(transformMergeStmt+0x339) [0x56108322d988]
        postgres: pryzbyj postgres [local] PREPARE(transformStmt+0x70) [0x5610831f4071]
        postgres: pryzbyj postgres [local] PREPARE(+0x1fa350) [0x5610831f5350]
        postgres: pryzbyj postgres [local] PREPARE(transformTopLevelStmt+0x11) [0x5610831f5385]
        postgres: pryzbyj postgres [local] PREPARE(parse_analyze_varparams+0x5b) [0x5610831f54f4]
        postgres: pryzbyj postgres [local] PREPARE(pg_analyze_and_rewrite_varparams+0x38) [0x5610834bcdfe]
        postgres: pryzbyj postgres [local] PREPARE(PrepareQuery+0xcc) [0x561083292155]
        postgres: pryzbyj postgres [local] PREPARE(standard_ProcessUtility+0x4ea) [0x5610834c31a0]

Why is $1 construed to be of type text ?



Re: MERGE and parsing with prepared statements

From
Matthias van de Meent
Date:
On Thu, 14 Jul 2022, 18:26 Justin Pryzby, <pryzby@telsasoft.com> wrote:
>
> We've used INSERT ON CONFLICT for a few years (with partitions as the target).
> That's also combined with prepared statements, for bulk loading.
>
> I was looking to see if we should use MERGE (probably not, but looking anyway).
> And came across this behavior.  I'm not sure if it's any issue.
>
> CREATE TABLE CustomerAccount (CustomerId int, Balance float);
>
> PREPARE p AS
> MERGE INTO CustomerAccount CA
> USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
> ON CA.CustomerId = T.CustomerId
> WHEN NOT MATCHED THEN
>   INSERT (CustomerId, Balance)
>   VALUES (T.CustomerId, T.TransactionValue)
> WHEN MATCHED THEN
>   UPDATE SET Balance = Balance + TransactionValue;
>
> ERROR:  operator does not exist: integer = text
> LINE 3: ON CA.CustomerId = T.CustomerId
>
> Why is $1 construed to be of type text ?

The select statement that generates the row type of T `(select $1 CID,
$2 TxV) AS T` does not put type bounds on the input parameters, so it
remains `unknown` for the scope of that subselect. Once stored into
the row type, the type of that column defaults to text, as a row type
should not have 'unknown'-typed columns.

You'll see the same issue with other subselects that select input
parameters without casts, such as `select a from (select $1 a) A where
A.a = 1;`. It's a pre-existing issue that has popped up earlier, and I
think it's not something we've planned to fix in backbranches.

Kind regards,

Matthias van de Meent



Re: MERGE and parsing with prepared statements

From
Justin Pryzby
Date:
On Fri, Jul 15, 2022 at 11:25:35AM +0200, Matthias van de Meent wrote:
> On Thu, 14 Jul 2022, 18:26 Justin Pryzby, <pryzby@telsasoft.com> wrote:
> >
> > Why is $1 construed to be of type text ?
> 
> The select statement that generates the row type of T `(select $1 CID,
> $2 TxV) AS T` does not put type bounds on the input parameters, so it
> remains `unknown` for the scope of that subselect. Once stored into
> the row type, the type of that column defaults to text, as a row type
> should not have 'unknown'-typed columns.

Thanks for looking into it.

I see now that the same thing can happen with "ON CONFLICT" if used with a
subselect.

PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a
     ON CONFLICT (i) DO UPDATE SET i=excluded.i;
ERROR:  column "i" is of type integer but expression is of type text

It seems a bit odd that it's impossible to use merge with prepared statements
without specifically casting the source types (which I did now to continue my
experiment).

-- 
Justin



Re: MERGE and parsing with prepared statements

From
Alvaro Herrera
Date:
On 2022-Jul-15, Justin Pryzby wrote:

> On Fri, Jul 15, 2022 at 11:25:35AM +0200, Matthias van de Meent wrote:

> Thanks for looking into it.

Definitely!  Thanks, Matthias.

> I see now that the same thing can happen with "ON CONFLICT" if used with a
> subselect.
> 
> PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a
>      ON CONFLICT (i) DO UPDATE SET i=excluded.i;
> ERROR:  column "i" is of type integer but expression is of type text

Right, I didn't think that MERGE was doing anything peculiar in this
respect.

> It seems a bit odd that it's impossible to use merge with prepared statements
> without specifically casting the source types (which I did now to continue my
> experiment).

I have no comments on this.  Maybe it can be improved, but I don't know
how.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/



Re: MERGE and parsing with prepared statements

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2022-Jul-15, Justin Pryzby wrote:
>> I see now that the same thing can happen with "ON CONFLICT" if used with a
>> subselect.
>> 
>> PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a
>> ON CONFLICT (i) DO UPDATE SET i=excluded.i;
>> ERROR:  column "i" is of type integer but expression is of type text

> Right, I didn't think that MERGE was doing anything peculiar in this
> respect.

Yeah.  The current theory about this is that if we haven't assigned a
type to an unknown-type parameter (or literal) that is an output
column of a sub-SELECT, we will as a rule force it to text.
That MERGE USING clause is a sub-SELECT, so that rule applies.

There is a hoary old exception to that rule, which is that if you
write INSERT INTO tab SELECT ..., $1, ...
we will figure out the type of the column of "tab" that $1 is going
into, and force $1 to that type instead of text.  It looks like this
also works in INSERT ... VALUES.  You could make a case that MERGE
should be equally smart, but it's not clear to me that the info is
available sufficiently close by to make it reasonable to do that.
It looks like the MERGE syntax has a couple of levels of indirection,
which'd probably be enough to put the kibosh on that idea -- in
particular, AFAICS there might not be a unique target column
corresponding to a given data_source column.

            regards, tom lane



Re: MERGE and parsing with prepared statements

From
"David G. Johnston"
Date:
On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2022-Jul-15, Justin Pryzby wrote:

> It seems a bit odd that it's impossible to use merge with prepared statements
> without specifically casting the source types (which I did now to continue my
> experiment).

I have no comments on this.  Maybe it can be improved, but I don't know
how.


Not tested, but the example prepare command fails to make use of the optional data types specification.  Using that should remove the need to cast the parameter placeholder within the query itself.

That said, in theory the INSERT specification of the MERGE could be used to either resolve unknowns or even forcibly convert the data types of the relation produced by the USING clause to match the actual types required for the INSERT (since that will happen at insert time anyway).  This would make the UPDATE behavior slightly different than a top-level UPDATE command though, since that does not have the same context information.

David J.

Re: MERGE and parsing with prepared statements

From
Justin Pryzby
Date:
On Fri, Jul 15, 2022 at 12:17:51PM -0700, David G. Johnston wrote:
> On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > On 2022-Jul-15, Justin Pryzby wrote:
> >
> > > It seems a bit odd that it's impossible to use merge with prepared statements
> > > without specifically casting the source types (which I did now to continue my
> > > experiment).
> >
> > I have no comments on this.  Maybe it can be improved, but I don't know
> > how.
>
> Not tested, but the example prepare command fails to make use of the
> optional data types specification.  Using that should remove the need to
> cast the parameter placeholder within the query itself.

What optional data type specification ?

> That said, in theory the INSERT specification of the MERGE could be used to
> either resolve unknowns or even forcibly convert the data types of the
> relation produced by the USING clause to match the actual types required
> for the INSERT (since that will happen at insert time anyway).

Yeah.  I hadn't looked before, but just noticed this:

https://www.postgresql.org/docs/devel/sql-merge.html
| If the expression for any column is not of the correct data type, automatic type conversion will be attempted.

That appears to be copied from the INSERT page.
What does that mean, if not that data types will be resolved as needed ?

Note that if I add casts to the "ON" condition, MERGE complains about the
INSERT VALUES.

PREPARE p AS
MERGE INTO CustomerAccount CA
USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId::int
WHEN NOT MATCHED THEN
  INSERT (CustomerId, Balance)
  VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
  UPDATE SET Balance = Balance + TransactionValue;

ERROR:  column "customerid" is of type integer but expression is of type text
LINE 7:   VALUES (T.CustomerId, T.TransactionValue)

        postgres: pryzbyj postgres [local] PREPARE(transformAssignedExpr+0x3b6) [0x5605f9699e8e]
        postgres: pryzbyj postgres [local] PREPARE(transformInsertRow+0x2ce) [0x5605f9653a47]
        postgres: pryzbyj postgres [local] PREPARE(transformMergeStmt+0x7ec) [0x5605f968fe3b]
        postgres: pryzbyj postgres [local] PREPARE(transformStmt+0x70) [0x5605f9656071]
        postgres: pryzbyj postgres [local] PREPARE(+0x1fa350) [0x5605f9657350]
        postgres: pryzbyj postgres [local] PREPARE(transformTopLevelStmt+0x11) [0x5605f9657385]
        postgres: pryzbyj postgres [local] PREPARE(parse_analyze_varparams+0x5b) [0x5605f96574f4]
        postgres: pryzbyj postgres [local] PREPARE(pg_analyze_and_rewrite_varparams+0x38) [0x5605f991edfe]
        postgres: pryzbyj postgres [local] PREPARE(PrepareQuery+0xcc) [0x5605f96f4155]
        postgres: pryzbyj postgres [local] PREPARE(standard_ProcessUtility+0x4ea) [0x5605f99251a0]
        postgres: pryzbyj postgres [local] PREPARE(ProcessUtility+0xdb) [0x5605f992587e]

-- 
Justin



Re: MERGE and parsing with prepared statements

From
"David G. Johnston"
Date:
On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, Jul 15, 2022 at 12:17:51PM -0700, David G. Johnston wrote:
> On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > On 2022-Jul-15, Justin Pryzby wrote:
> >
> > > It seems a bit odd that it's impossible to use merge with prepared statements
> > > without specifically casting the source types (which I did now to continue my
> > > experiment).
> >
> > I have no comments on this.  Maybe it can be improved, but I don't know
> > how.
>
> Not tested, but the example prepare command fails to make use of the
> optional data types specification.  Using that should remove the need to
> cast the parameter placeholder within the query itself.

What optional data type specification ?

The one documented here:


PREPARE name [ ( data_type [, ...] ) ] AS statement

David J.

Re: MERGE and parsing with prepared statements

From
"David G. Johnston"
Date:
On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

That appears to be copied from the INSERT page.
What does that mean, if not that data types will be resolved as needed ?

Yep, and the system needs to resolve the type at a point where there is no contextual information and so it chooses text.


Note that if I add casts to the "ON" condition, MERGE complains about the
INSERT VALUES.

PREPARE p AS
MERGE INTO CustomerAccount CA
USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId::int
WHEN NOT MATCHED THEN
  INSERT (CustomerId, Balance)
  VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
  UPDATE SET Balance = Balance + TransactionValue;

ERROR:  column "customerid" is of type integer but expression is of type text
LINE 7:   VALUES (T.CustomerId, T.TransactionValue)


Noted.  Not surprised.  That error was always present, it's just that the join happens first.  Since your fix narrowly targeted the join this error remained to be discovered.

David J.

Re: MERGE and parsing with prepared statements

From
Justin Pryzby
Date:
On Fri, Jul 15, 2022 at 12:59:34PM -0700, David G. Johnston wrote:
> On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> 
> >
> | If the expression for any column is not of the correct data type, automatic type conversion will be attempted.
> > That appears to be copied from the INSERT page.
> > What does that mean, if not that data types will be resolved as needed ?
>
> Yep, and the system needs to resolve the type at a point where there is no
> contextual information and so it chooses text.

I don't know if that's a viable interpretation.  The parser "resolved the type"
by assuming a default, which caused an error before finishing parsing.

In the case of INSERT, "conversion will be attempted" means it looks for a cast
from the source type to the target type, and its "automatic type conversion"
will fail if (for example) you try to insert a timestamp into an int.

In the case of MERGE, that same sentence evidently means that it assumes a
default source type (rather than looking for a cast) and then fails if it
doesn't exactly match the target type.

Should that sentence be removed from MERGE ?

-- 
Justin



Re: MERGE and parsing with prepared statements

From
Justin Pryzby
Date:
On Fri, Jul 15, 2022 at 03:43:41PM -0500, Justin Pryzby wrote:
> Should that sentence be removed from MERGE ?

Also, I think these examples should be more similar.

doc/src/sgml/ref/merge.sgml

> <programlisting>
> MERGE INTO CustomerAccount CA
> USING RecentTransactions T
> ON T.CustomerId = CA.CustomerId
> WHEN MATCHED THEN
>   UPDATE SET Balance = Balance + TransactionValue
> WHEN NOT MATCHED THEN
>   INSERT (CustomerId, Balance)
>   VALUES (T.CustomerId, T.TransactionValue);
> </programlisting>
>   </para>
> 
>   <para>
>    Notice that this would be exactly equivalent to the following
>    statement because the <literal>MATCHED</literal> result does not change
>    during execution.
> 
> <programlisting>
> MERGE INTO CustomerAccount CA
> USING (Select CustomerId, TransactionValue From RecentTransactions) AS T
> ON CA.CustomerId = T.CustomerId
> WHEN NOT MATCHED THEN
>   INSERT (CustomerId, Balance)
>   VALUES (T.CustomerId, T.TransactionValue)
> WHEN MATCHED THEN
>   UPDATE SET Balance = Balance + TransactionValue;
> </programlisting>
>   </para>

The "ON" lines can be the same.
The "MATCHED" can be in the same order.

-- 
Justin



Re: MERGE and parsing with prepared statements

From
Simon Riggs
Date:
On Fri, 12 Aug 2022 at 12:20, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2022-Jul-18, Justin Pryzby wrote:
>
> > On Fri, Jul 15, 2022 at 03:43:41PM -0500, Justin Pryzby wrote:
> > > Should that sentence be removed from MERGE ?
> >
> > Also, I think these examples should be more similar.
>
> Agreed, done.

Sorry, but I disagree with this chunk in the latest commit,
specifically, changing the MATCHED from after to before the NOT
MATCHED clause.

The whole point of the second example was to demonstrate that the
order of the MATCHED/NOT MATCHED clauses made no difference.

By changing the examples so they are the same, the sentence at line
573 now makes no sense.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: MERGE and parsing with prepared statements

From
Alvaro Herrera
Date:
On 2022-Aug-12, Simon Riggs wrote:

> Sorry, but I disagree with this chunk in the latest commit,
> specifically, changing the MATCHED from after to before the NOT
> MATCHED clause.
> 
> The whole point of the second example was to demonstrate that the
> order of the MATCHED/NOT MATCHED clauses made no difference.
> 
> By changing the examples so they are the same, the sentence at line
> 573 now makes no sense.

Hmm, I thought the point of the example was to show that you can replace
the table in the USING clause with a query that retrieves the column;
but you're right, we lost the thing there.  Maybe it was too subtle to
the point that I failed to understand it.  Perhaps we can put it back
the way it was and explain these two differences (change of data source
*and* clause ordering) more explicitly.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La virtud es el justo medio entre dos defectos" (Aristóteles)



Re: MERGE and parsing with prepared statements

From
Justin Pryzby
Date:
On Fri, Aug 12, 2022 at 01:53:25PM +0200, Alvaro Herrera wrote:
> On 2022-Aug-12, Simon Riggs wrote:
> 
> > Sorry, but I disagree with this chunk in the latest commit,
> > specifically, changing the MATCHED from after to before the NOT
> > MATCHED clause.

3d895bc84 also moved a semicolon into the middle of the sql statement.

> > The whole point of the second example was to demonstrate that the
> > order of the MATCHED/NOT MATCHED clauses made no difference.
> > 
> > By changing the examples so they are the same, the sentence at line
> > 573 now makes no sense.
> 
> Hmm, I thought the point of the example was to show that you can replace
> the table in the USING clause with a query that retrieves the column;
> but you're right, we lost the thing there.  Maybe it was too subtle to
> the point that I failed to understand it.  Perhaps we can put it back
> the way it was and explain these two differences (change of data source
> *and* clause ordering) more explicitly.

Evidently I misunderstood it, too.

-- 
Justin



Re: MERGE and parsing with prepared statements

From
Alvaro Herrera
Date:
On 2022-Aug-12, Simon Riggs wrote:

> Sorry, but I disagree with this chunk in the latest commit,
> specifically, changing the MATCHED from after to before the NOT
> MATCHED clause.
> 
> The whole point of the second example was to demonstrate that the
> order of the MATCHED/NOT MATCHED clauses made no difference.
> 
> By changing the examples so they are the same, the sentence at line
> 573 now makes no sense.

Not sure how to fix this.  We could add put the WHEN clauses in the
order they were, and add another phrase to that paragraph to explain
more explicitly that the order is not relevant (while at the same time
explaining that if you have multiple WHEN MATCHED clauses, the relative
order of those *is* relevant).

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"At least to kernel hackers, who really are human, despite occasional
rumors to the contrary" (LWN.net)