Home > mailing lists

Re: proposal: row_to_array function - Mailing list pgsql-hackers

From	Pavel Stehule
Subject	Re: proposal: row_to_array function
Date	June 22, 2015 07:47:02
Msg-id	CAFj8pRA8Z76_gSN39UBHuH3sQssE+3x-N8Rhgjvd_9wEzYgfng@mail.gmail.com Whole thread Raw
In response to	Re: proposal: row_to_array function (Craig Ringer <craig@2ndquadrant.com>)
Responses	Re: proposal: row_to_array function
List	pgsql-hackers

Tree view

2015-06-22 5:18 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:

On 2 April 2015 at 01:59, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Sun, Mar 29, 2015 at 1:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> here is rebased patch.
>>> It contains both patches - row_to_array function and foreach array support.
>>
>> While I don't have a problem with hstore_to_array, I don't think that
>> row_to_array is a very good idea; it's basically encouraging people to
>> throw away SQL datatypes altogether and imagine that everything is text.
>> They've already bought into that concept if they are using hstore or
>> json, so smashing elements of those containers to text is not a problem.
>> But that doesn't make this version a good thing.
>>
>> (In any case, those who insist can get there through row_to_json, no?)
>
> You have a point. What does attached do that to_json does not do
> besides completely discard type information? Our json api is pretty
> rich and getting richer. For better or ill, we dumped all json
> support into the already stupendously bloated public namespace and so
> it's always available.

I can see plenty of utility for a function like Pavel speaks of, but
I'd personally rather see it as a function that returns table (colname
name, coltype regtype, coltypmod integer, coltextvalue text,
colordinal integer) so it can carry more complete information and
there's no need to worry about foreach(array). The main use of a
function that includes text representations of the values would IMO be
using it from plain SQL, rather than PL/PgSQL, when faced with
anonymous records.

I'd find it more useful to have lvalue-expressions for dynamic access
to record fields and a function to get record metadata - field names,
types and typmods. Some kind of "pg_get_record_info(record) returns
table(fieldname text, fieldtype regtype, fieldtypmod integer)" and a
PL/PgSQL lvalue-expression for record field access like
"RECORD_FIELD(therecord, fieldname)". I would _certainly_ want to be
able to get the type metadata without the values.

That way you could interact natively with the fields in their true
types, without forcing conversion into and out of 'text', which is a
known performance pain-point with PL/PgSQL. (PL/PgSQL doesn't have a
VARIANT type or support for using 'anyelement', which would be the
other way to solve the type flattening problem IMO).

Think:

DECLARE
myrow record;
fi record;
BEGIN
EXECUTE user_supplied_dynamic_query INTO myrow;
FOR fi IN
SELECT fieldname, fieldtype, fieldtypmod
FROM pg_get_record_info(myrow)
LOOP
IF fi.fieldtype == 'int4'::regtype THEN
RECORD_FIELD(myrow, fi.fieldname) := RECORD_FIELD(myrow,
fi.fieldname) + 1;
END IF;
END LOOP;
END;

I am thinking so this is separate task, that should not be solved simply too. I wrote a set functions for working with record (https://github.com/okbob/pltoolbox/blob/master/record.c). But it doesn't solve the basic issues:

1. speed - FOR IN SELECT FROM is more expensive then just unpacking row or record

2. unclean game with creating more code path for any special type.

I have little bit different idea. FOR IN RECORD can change type of any automatic variable in any iteration. Internally we can do more code paths - so your code can be rewritten to

FOREACH key, val IN RECORD myrow

LOOP

IF pg_typeof(val) IN ('int4', 'double precision', 'numeric') THEN

val := val + 1; -- these variables can be mutable

-- or maybe in futore

myrow[key] := val + 1;

END IF;

END LOOP;

What is important - "val" is automatic variable, and it can has different type in any step.

It is little bit strange, but impossible to solve, so we cannot to support row[var] as right value (without immutable casting). But we can do it with left value.

OK, so it's a stupid example - increment all int4 fields by one. It
conveys the rough idea though - native use of the field types.

Note that RECORD_FIELD is distinct from the existing support for

EXECUTE format('SELECT $1.%I', fieldname) USING therecord;

in that that approach doesn't work for all ways that a record can be
produced, it's slow, it doesn't have a good way to enumerate field
names, and there's no equivalent to write to the field. Current
approaches for that are ghastly:
http://stackoverflow.com/q/7711432/398670 .

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

From: Fabien COELHO
Date: 22 June 2015, 07:29:54
Subject: Re: checkpointer continuous flushing

From: Fabien COELHO
Date: 22 June 2015, 08:11:16
Subject: Re: checkpointer continuous flushing

Re: proposal: row_to_array function - Mailing list pgsql-hackers

Previous

Next