Re: Manipulating complex types as non-contiguous structures in-memory - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Manipulating complex types as non-contiguous structures in-memory
Date
Msg-id 20150509224454.GG12950@alap3.anarazel.de
Whole thread Raw
In response to Re: Manipulating complex types as non-contiguous structures in-memory  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Manipulating complex types as non-contiguous structures in-memory  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

> The attached updated patch reduces both of those do-loop tests to about
> 60 msec on my machine.  It contains two improvements over the 1.1 patch:

Looking at this. First reading the patch to understand the details.

* The VARTAG_IS_EXPANDED(tag) trick in VARTAG_SIZE is unlikely to beneficial, before the compiler could implement the
wholething as a computed goto or lookup table, afterwards not.
 
* It'd be nice if the get_flat_size comment in expandeddatm.h could specify whether the header size is included. That
variesenough around toast that it seems worthwhile.
 
* You were rather bothered by the potential of multiple evaluations for the ilist stuff. And now the AARR macros are
fullof them...
 
* I find the ARRAY_ITER_VARS/ARRAY_ITER_NEXT macros rather ugly. I don't buy the argument that turning them into
functionswill be slower. I'd bet the contrary on common platforms.
 
* Not a fan of the EH_ prefix in array_expanded.c and EOH_ elsewhere. Just looks ugly to me. Whatever.
* The list of hardwired safe ops in exec_check_rw_parameter is somewhat sad. Don't have a better idea though.
* "Also, a C function that is modifying a read-write expanded value in-place should take care to leave the value in a
sanestate if it fails partway through." - that's a pretty hefty requirement imo.  I wonder if it'd not be possible to
convertRW to RO if a value originates from outside an exception block.  IIRC that'd be useful for a bunch of other
errorcases we currently basically shrug away (something around toast and aborted xacts comes to mind).
 
* The forced RW->RO conversion in subquery scans is a bit sad, but I seems like something left for later.

These are more judgement calls than anything else...

Somewhere in the thread you comment on the fact that it's a bit sad that
plpgsql is the sole benefactor of this (unless some function forces
expansion internally).  I'd be ok to leave it at that for now. It'd be
quite cool to get some feedback from postgis folks about the suitability
of this for their cases.

I've not really looked into performance improvements around this,
choosing to look into somewhat reasonable cases where it'll
regress.

ISTM that the worst case for the new situation is large arrays that
exist as plpgsql variables but are only ever passed on. Say e.g. a
function that accepts an array among other parameters and passes it on
to another function.

As rather extreme case of this:
CREATE OR REPLACE FUNCTION plpgsql_array_length(p_a anyarray)
RETURNS int LANGUAGE plpgsql AS $$
BEGIN   RETURN array_length(p_a, 1);
END; $$;

SELECT plpgsql_array_length(b.arr)
FROM (SELECT array_agg(d) FROM generate_series(1, 10000) d) b(arr),  generate_series(1, 100000) repeat;
with \o /dev/null redirecting the output.

in an assert build it goes from 325.511 ms to 655.733 ms
optimized from 94.648 ms to 287.574 ms.

Now this is a fairly extreme example; and I don't think it'll get much
worse than that. But I do think there's a bunch of cases where values
exist in plpgsql that won't actually be accessed. Say, e.g. return
values from queries that are then conditionally returned and such.

I'm not sure it's possible to do anything about that. Expanding only in
cases where it'd be beneficial is going to be hard.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Auditing extension for PostgreSQL (Take 2)
Next
From: Stephen Frost
Date:
Subject: Re: Default Roles (was: Additional role attributes)