Re: proposal: function parse_ident - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: proposal: function parse_ident
Msg-id CAFj8pRD7Ri-f_kFpuV7es7SE_-zhm0+L2RG7tECu8JibYSmm6w@mail.gmail.com
In response to Re: proposal: function parse_ident  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: proposal: function parse_ident  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Hi

2015-09-09 21:55 GMT+02:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
Pavel Stehule wrote:

> I cannot use the current SplitIdentifierString because it is designed for
> a different purpose, and it cannot separate the non-identifier part. But the
> code is simple, and it will be cleaned up.
>
>  postgres=# select * from parse_ident('"AHOJ".NAZDAR[]'::text);
> ┌───────────────┬───────┐
> │     parts     │ other │
> ╞═══════════════╪═══════╡
> │ {AHOJ,nazdar} │ []    │
> └───────────────┴───────┘
> (1 row)

Um.  Now this is really getting into much of the same territory I got
into with the objname/objargs arrays for pg_get_object_address.  I think
the "other" bit is a very poor solution to that.

If you want to be able to parse names for all kinds of objects, you need
a solution much more complex than this function.  I think a clean
solution returns three sets of things; one is the primary part of the
name, which is an array of text; the second is the secondary name, which
is another array of text; the third is an array of TypeName.

For the name of a relation, only the first of these arrays is used.  For
the name of objects subsidiary to a relation, the first two are used
(the first array is the name of the relation itself, and the second
array is the name of the object; for instance a constraint name, or a
trigger name).

The array of type names is necessary because the parsing of TypeName is
completely unlike parsing of plain names.  You need [] decorator and
typmod.  If you consider objects such as casts, you need two TypeNames
("from" and "to"), hence this is an array and not a single one.  As far
as I recall there are other object types that also need more than one
TypeName.

For the name of a function, you need the first text array, and the array
of TypeName which are the input argument types.
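
To make the three-part result described above concrete, here is a minimal Python sketch of the data shape. It is only an illustration, not PostgreSQL code: the names `TypeName`, `ParsedObjectName`, `primary`, `secondary`, and `types` are hypothetical, and the example object names (`public.orders`, `orders_pkey`, `add`) are made up. Leaving the primary array empty for a cast is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TypeName:
    """Hypothetical stand-in for a parsed type name."""
    names: List[str]                                  # possibly schema-qualified name
    typmod: List[int] = field(default_factory=list)   # e.g. [10, 2] for numeric(10,2)
    array_dims: int = 0                               # number of [] decorators


@dataclass
class ParsedObjectName:
    """Hypothetical three-part result: primary name, secondary name, type names."""
    primary: List[str] = field(default_factory=list)
    secondary: List[str] = field(default_factory=list)
    types: List[TypeName] = field(default_factory=list)


# A relation uses only the primary array.
relation = ParsedObjectName(primary=["public", "orders"])

# An object subsidiary to a relation (here a constraint) uses the first two arrays.
constraint = ParsedObjectName(primary=["public", "orders"],
                              secondary=["orders_pkey"])

# A cast needs two type names, "from" and "to" (empty primary is an assumption).
cast = ParsedObjectName(types=[TypeName(["int4"]), TypeName(["text"])])

# A function uses the primary array plus its input argument types.
function = ParsedObjectName(primary=["public", "add"],
                            types=[TypeName(["int4"]), TypeName(["int4"])])
```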

If you don't want to have all this complexity, I think you need to forgo
the idea of the "other" thingy that you propose above, and just concern
yourself with the first bits.  I don't think "AHOJ".NAZDAR[] is an
identifier.

Yes, usually I don't need the "other" part. And if I need it, I can get it as the difference against the original string. But sometimes you don't get a clean string, and you have to find the end of the identifier. SplitIdentifierString only looks for the separator character, so it cannot find the end of an identifier. So a slightly modified API could look like:

CREATE OR REPLACE FUNCTION parse_ident(str text, strict boolean DEFAULT true) RETURNS text[]

raises a "syntax error" exception for '"AHOJ".NAZDAR[]' when "strict" is true
returns "AHOJ".nazdar for '"AHOJ".NAZDAR[]' when "strict" is false
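
The strict and non-strict behavior described above can be sketched in Python as follows. This is only a rough model of the proposal, not the C implementation: it assumes unquoted parts are folded to lower case, quoted parts keep their case with `""` as an escaped quote, and an unquoted identifier starts with a letter or underscore. The function returns the parts as a list, mirroring the `text[]` return type.

```python
from typing import List


def parse_ident(s: str, strict: bool = True) -> List[str]:
    """Split a dotted, optionally quoted identifier at the start of s."""
    parts: List[str] = []
    i, n = 0, len(s)

    def skip_ws(pos: int) -> int:
        while pos < n and s[pos].isspace():
            pos += 1
        return pos

    while True:
        i = skip_ws(i)
        if i < n and s[i] == '"':
            # Quoted part: keep case, treat "" as one literal quote.
            i += 1
            buf = []
            while True:
                if i >= n:
                    raise ValueError("syntax error: unterminated quoted identifier")
                if s[i] == '"':
                    if i + 1 < n and s[i + 1] == '"':
                        buf.append('"')
                        i += 2
                        continue
                    i += 1
                    break
                buf.append(s[i])
                i += 1
            if not buf:
                raise ValueError("syntax error: zero-length quoted identifier")
            parts.append("".join(buf))
        elif i < n and (s[i].isalpha() or s[i] == "_"):
            # Unquoted part: fold to lower case.
            start = i
            while i < n and (s[i].isalnum() or s[i] in "_$"):
                i += 1
            parts.append(s[start:i].lower())
        else:
            raise ValueError(f"syntax error: identifier expected at position {i}")

        i = skip_ws(i)
        if i < n and s[i] == ".":
            i += 1          # separator: another part must follow
            continue
        break

    # In strict mode, anything left after the identifier is an error;
    # otherwise parsing simply stops at the end of the identifier.
    if strict and i < n:
        raise ValueError(f"syntax error: unexpected text at position {i}")
    return parts
```

With this sketch, `parse_ident('"AHOJ".NAZDAR[]', strict=False)` yields the two parts `AHOJ` and `nazdar`, while the same call with `strict=True` raises an error because of the trailing `[]`.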

Pavel



--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
