Re: patch: General purpose utility functions used by the JSON data type - Mailing list pgsql-hackers

From Joseph Adams
Subject Re: patch: General purpose utility functions used by the JSON data type
Date
Msg-id AANLkTinBkt+N+ee=SEaqX8996wjvTBRL0vm8khcnWVmV@mail.gmail.com
Whole thread Raw
In response to Re: patch: General purpose utility functions used by the JSON data type  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: patch: General purpose utility functions used by the JSON data type
Re: patch: General purpose utility functions used by the JSON data type
List pgsql-hackers
On Fri, Aug 13, 2010 at 10:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Aug 13, 2010 at 5:45 AM, Joseph Adams
> <joeyadams3.14159@gmail.com> wrote:
>> getEnumLabelOids
>>  * Useful-ometer: ()-----------------------------------o
>>  * Rationale: There is currently no streamlined way to return a custom
>> enum value from a PostgreSQL function written in C.  This function
>> performs a batch lookup of enum OIDs, which can then be cached with
>> fn_extra.  This should be reasonably efficient, and it's quite elegant
>> to use.
>
> It's possible that there might be a contrib module out there someplace
> that wants to do this, but it's clearly a waste of time for a core
> datatype.  The OIDs are fixed.  Just get rid of the enum altogether
> and use the OIDs directly wherever you would have used the enum.  Then
> you don't need this.

Indeed, a built-in JSON data type will certainly not need it.
However, users may want to return enums from procedures written in C,
and this function provides a way to do it.

> Incidentally, if we were going to accept this function, I think we'd
> need to add some kind of a check to throw an error if any of the
> labels can't be mapped onto an Oid.  Otherwise, an error in this area
> might lead to difficult-to-find misbehavior.

I agree.  Perhaps an ereport(ERROR, ...) would be better, since it
would not be hard for a user to cause an enum mapping error (even in a
production database) by updating a stored procedure in C but not
updating the corresponding enum (or vice versa, if enum labels are
renamed).

>> FN_EXTRA, FN_EXTRA_ALLOC, FN_MCXT
>>  * Useful-ometer: ()--------------------o
>>  * Rationale: Using fcinfo->flinfo->fn_extra takes a lot of
>> boilerplate.  These macros help cut down the boilerplate, and the
>> comment explains what fn_extra is all about.
>
> I think this is not a good idea.  Macros that use local variables
> under the covers make it hard for new contributors to read the code
> and understand what it does.  It's only worth doing if it saves a lot
> of typing, and this doesn't.  If we were going to do this, the right
> thing for your patch to do would be to update every instance of this
> coding pattern in our source base, so that people who copy those
> examples in the future do it the new way.  But I think there's no real
> advantage in that: it would complicate back-patching future bug fixes
> for no particularly compelling reason.

Fair enough.  Perhaps the comment about FN_EXTRA (which explains
fn_extra in more detail) could be reworded (to talk about using
fcinfo->flinfo->fn_extra manually) and included in the documentation
at xfunc-c.html .  fn_extra currently only gets a passing mention in
the discussion about set-returning functions.

>> pg_substring, pg_encoding_substring
>>  * Useful-ometer: ()-------o
>>  * Rationale: The JSONPath code uses it / will use it for extracting
>> substrings, which is probably not a very useful feature (but who am I
>> to say that).  This function could probably benefit the
>> text_substring() function in varlena.c , but it would take a bit of
>> work to ensure it continues to comply with standards.
>
> I'm more positive about this idea.  If you can make this generally
> useful, I'd encourage you to do that.  On a random coding style note,
> I see that you have two copies of the following code, which can fairly
> obviously be written in a single line rather than five, assuming it's
> actually safe.
>
> +               if (sub_end + len > e)
> +               {
> +                       Assert(false);          /* Clipped multibyte
> character */
> +                       break;
> +               }

If I simply say Assert(sub_end + len <= e), the function will yield a
range hanging off the edge of the input string (out of bounds).  The
five lines include a safeguard against that when assertion checking is
off.  This could happen if the input string has a clipped multibyte
character at the end.  Perhaps I should just drop the assertions and
make it so if there's a clipped character at the end, it'll be
ignored, no big deal.


Joey Adams


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: including backend ID in relpath of temp rels - updated patch
Next
From: Tom Lane
Date:
Subject: Re: including backend ID in relpath of temp rels - updated patch