Thread: migrating from mysql: need to convert empty string to null

migrating from mysql: need to convert empty string to null

From
"Dave Lee"
Date:
Hi,

We have an existing (PHP) code base that is being converted to use
PostgreSQL from MySQL. In most places, our insert and update
statements are formed using single quoted values, and when there is no
value, the empty string is being passed in. PostgreSQL objects to
empty strings for certain column types, particularly numeric columns.
We're trying to get this conversion done quickly, and so a solution
involving customization of string to X type conversion is what we're
after.

What I first tried to do was,

CREATE OR REPLACE FUNCTION varchar_to_int_with_empty_string_handling(varchar)
RETURNS integer AS $$
  SELECT CASE
    WHEN $1 = '' THEN NULL
    ELSE pg_catalog.int4($1)
  END
$$ IMMUTABLE LANGUAGE SQL;

DROP CAST IF EXISTS (varchar AS integer);
CREATE CAST (varchar AS integer)
WITH FUNCTION varchar_to_int_with_empty_string_handling(varchar)
AS ASSIGNMENT;


This seems broken, when loading this file a second or subsequent time
(we append mysql compat functions to it as we progress), there is an
error saying pg_catalog.int4 doesn't exist. So somehow, the cast above
is deleting/disabling/hiding pg_catalog.int4?

While experimenting, trying to understand what I'm doing wrong, I ran
this query:

SELECT castsource::regtype, casttarget::regtype,
castfunc::regprocedure, castcontext
FROM pg_cast
WHERE casttarget = 'int'::regtype

and I notice that there isn't any rows specified for converting
varchar or text to int. Which raises the question, if I run:

SELECT '123'::int;

What conversion is actually happening here?

Any answers are much appreciated,

thanks
Dave

Re: migrating from mysql: need to convert empty string to null

From
Tom Lane
Date:
"Dave Lee" <davelee.com@gmail.com> writes:
> and I notice that there isn't any rows specified for converting
> varchar or text to int. Which raises the question, if I run:

> SELECT '123'::int;

> What conversion is actually happening here?

None.  There is no type coercion there --- what that is really
specifying is invocation of the int data type's input function
on the given string.

I don't think your approach is going to work; you would have to
make your client programs emit things like

    insert into mytab(integercol) values (''::varchar)

in order to get a runtime varchar-to-int cast to be applied.
At that point you might as well fix them properly.

            regards, tom lane

Re: migrating from mysql: need to convert empty string to null

From
"Dave Lee"
Date:
On Wed, Jun 18, 2008 at 12:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> None.  There is no type coercion there --- what that is really
> specifying is invocation of the int data type's input function
> on the given string.

I thought that something like this may be this was the case.

> I don't think your approach is going to work; you would have to
> make your client programs emit things like
>
>        insert into mytab(integercol) values (''::varchar)
>
> in order to get a runtime varchar-to-int cast to be applied.
> At that point you might as well fix them properly.

I see. Other than directly modifying int4in (is this the one?), is
there a way to plug-in our modified empty string handling logic? I'm
picturing a scenario where we write write a wrapper function that
tests for empty strings and returns NULL, else just calls into int4in,
and then dynamically load this function, and finally update the
corresponding rows in pg_proc? Is this even possible?

thanks for your explanation Tom,

Dave

Re: migrating from mysql: need to convert empty string to null

From
Tom Lane
Date:
"Dave Lee" <davelee.com@gmail.com> writes:
> I see. Other than directly modifying int4in (is this the one?), is
> there a way to plug-in our modified empty string handling logic? I'm
> picturing a scenario where we write write a wrapper function that
> tests for empty strings and returns NULL, else just calls into int4in,
> and then dynamically load this function, and finally update the
> corresponding rows in pg_proc? Is this even possible?

There isn't any really nice way to do that :-(.  You could put a wrapper
function around int4in but it would not help, because the internal API
for datatype input functions doesn't support having them return NULL
(see InputFunctionCall in the sources).  If you don't want to fix your
clients then you'll really have little choice but to hack the Postgres
source code.  The good news is there are probably not very places you'd
have to change, depending on exactly what contexts you need this to
happen in.

            regards, tom lane

Re: migrating from mysql: need to convert empty string to null

From
"Dave Lee"
Date:
On Wed, Jun 18, 2008 at 3:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> There isn't any really nice way to do that :-(.  You could put a wrapper
> function around int4in but it would not help, because the internal API
> for datatype input functions doesn't support having them return NULL
> (see InputFunctionCall in the sources).

I've read src/backend/utils/fmgr/README and it states that returning
NULL is just a matter of setting isnull to true in the
FunctionCallInfo struct, and provides a convenience macro,
PG_RETURN_NULL. But then, in InputFunctionCall, I presume you're
referring to the logic that that only allows NULL to be returned if
the source input string is also NULL, otherwise raising an error. At a
basic level, you're suggesting I need to take out this NULL check in
InputFunctionCall to get the capability I'm looking for?

Any idea why the input functions were designed this way? For users
migrating from other RDBMs, if no one else, being able to return NULL
from an input function seems to be a valuable feature. I might use
this hacked functionality to return NULL for '0000-00-00' dates which
MySQL foolishly accepts. Maybe it's only useful to former MySQL users

thanks,
Dave

Re: migrating from mysql: need to convert empty string to null

From
Tom Lane
Date:
"Dave Lee" <davelee.com@gmail.com> writes:
> I've read src/backend/utils/fmgr/README and it states that returning
> NULL is just a matter of setting isnull to true in the
> FunctionCallInfo struct, and provides a convenience macro,
> PG_RETURN_NULL. But then, in InputFunctionCall, I presume you're
> referring to the logic that that only allows NULL to be returned if
> the source input string is also NULL, otherwise raising an error. At a
> basic level, you're suggesting I need to take out this NULL check in
> InputFunctionCall to get the capability I'm looking for?

No, I'm stating it as a fact that that won't work.  You'd have to
change InputFunctionCall's API so that it could pass back a null to
its callers.

> Any idea why the input functions were designed this way?

Per above, and the fact that the behavior you're proposing is a
complete violation of the SQL spec (yeah, I know Oracle does it too).

            regards, tom lane