Re: proposal - function string_to_table - Mailing list pgsql-hackers
| From | Peter Smith |
|---|---|
| Subject | Re: proposal - function string_to_table |
| Date | |
| Msg-id | CAHut+PsKQu429JKc3ynysfMXrD2G7wVtLW+EytO+GX-6Jw=UwA@mail.gmail.com Whole thread |
| In response to | Re: proposal - function string_to_table (Pavel Stehule <pavel.stehule@gmail.com>) |
| Responses |
Re: proposal - function string_to_table
|
| List | pgsql-hackers |
Hi.
I have been looking at the patch: string_to_table-20200706-2.patch
Below are some review comments for your consideration.
====
COMMENT func.sgml (style)
+ <para>
+ splits string into table using supplied delimiter and
+ optional null string.
+ </para>
The format style of the short description is inconsistent with the
other functions.
e.g. Should start with Capital letter.
e.g. Should tag the parameter names properly
Something like:
<para>
Splits <parameter>string</parameter> into a table
using supplied <parameter>delimiter</parameter>
and optional null string <parameter>nullstr</parameter>.
</para>
====
COMMENT func.sgml (what does nullstr do)
The description does not sufficiently describe the purpose/behaviour
of the nullstr.
e.g. Firstly I thought that it meant if 2 consecutive delimiters were
encountered it would substitute this string as the row value. But it
is doing the opposite of what I guessed - if the extracted row value
is the same as nullstr then a NULL row is inserted instead.
====
COMMENT func.sgml (wrong sample output)
+<programlisting>xx
+yy,
+zz</programlisting>
This output is incorrect for the sample given. There is no "yy," in
the output because there is a 'yy' nullstr substitution.
Should be:
---
xx
NULL
zz
---
====
COMMENT func.sgml (related to regexp_split_to_table)
Because this new function is similar to the existing
regexp_split_to_table, perhaps they should cross-reference each other
so a reader of this documentation is made aware of the alternative
function?
====
COMMENT (test cases)
It is impossible to tell difference in the output between empty
strings and nulls currently, so maybe you can change all the tests to
have a form like below so they can be validated properly:
# select v, v IS NULL as "is null" from
string_to_table('a,b,*,c,d,',',','*') g(v);
v | is null
---+---------
a | f
b | f
| t
c | f
d | f
| f
(6 rows)
or maybe like this is even easier:
# select quote_nullable(string_to_table('a|*||c|d|','|','*'));
quote_nullable
----------------
'a'
NULL
''
'c'
'd'
''
(6 rows)
Something similar was already proposed before [1] but that never got
put into the test code.
[1] https://www.postgresql.org/message-id/CAFj8pRDSzDYmaS06dfMXBfbr8x%2B3xjDJxA5kbL3h8%2BeOGoRUcA%40mail.gmail.com
====
COMMENT (test cases)
There are multiple combinations of the parameters to this function and
MANY different results depending on different values they can take, so
the existing tests only cover a small sample.
I have attached a lot more test scenarios that you may want to include
for better test coverage. Everything seemed to work as expected.
PSA test-cases.pdf
====
COMMENT (accum_result)
+ Datum values[1];
+ bool nulls[1];
+
+ values[0] = PointerGetDatum(result_text);
+ nulls[0] = is_null;
Why not use variables instead of arrays with only 1 element?
====
COMMENT (text_to_array_internal)
+ if (!tstate.astate)
+ PG_RETURN_ARRAYTYPE_P(construct_empty_array(TEXTOID));
Maybe the condition is more readable when expressed as:
if (tstate.astate == NULL)
====
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachment
pgsql-hackers by date: