Re: MULTISET patch - Mailing list pgsql-hackers

From Itagaki Takahiro
Subject Re: MULTISET patch
Date
Msg-id AANLkTikhWYXKX1+J4LLqcGRdwXpQekk=nDDxyBSfJ2R6@mail.gmail.com
Whole thread Raw
In response to Re: MULTISET patch  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: MULTISET patch  ("Erik Rijkers" <er@xs4all.nl>)
List pgsql-hackers
Here is an updated patch for MULTISET functions support.
- Fixed a bug in MULTISET EXCEPT ALL.
- Fixed a bug of NULL handling in SUBMULTISET OF.
- Removed array_flatten() from exported function list. It is still
  used internally, but I doubt it would be useful in SQL level.

On Mon, Dec 27, 2010 at 21:27, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> * trim_array - you use a deconstruct_array. It unpack  all fields and
>>> it could not be effective. Can we limit a unpacked array?

I modified the logic to use existing array_get_slice(),
that uses memcpy to make a sliced array without unpacking.

> DB2 use a both names.  Almost all array functions has a name array_* .
> Sure, standard is primary target. It's only my idea, so we can have a
> both names. It's not significant.

We also use array_xxx for array function names, but TRIM_ARRAY()
is the standard... Since slice syntax ARRAY[lo:hi] should be more
useful  than trim_array(), I don't have a plan to add the alias.

> * three state boolean - true, false, -1. I am not sure, if this is
> correct style. Using a second variable can be a more clean

I modified the code to use two booleans.

> * you doesn't a realese a deconstructed array

I don't think it needs to be fixed because functions are executed
in per-tuple memory context in normal cases.


On Mon, Dec 27, 2010 at 17:05, Simon Riggs <simon@2ndquadrant.com> wrote:
> I think collect() is non-identical to fusion() but the tests don't
> highlight that, so I think we need more tests to highlight null
> handling.

collect() exactly equals to array_agg() in the current implementation.
It should return a MULTISET value instead of ARRAY in the spec,
but we don't have MULTISET data type for now.

> You mention that SQL standard doesn't handle multi-dim arrays,
> so I think we need some tests to define and check that behaviour.

I added additional tests for multi-dimensional arrays and NULL elements.

> Without any docs the only point of reference to understand the
> PostgreSQL implementation is by reading the tests. IMHO docs explain a
> patch and make a review possible; they aren't something that can be
> written after a review. Perhaps the nag doesn't apply as much when you
> supply external references, but reviewer shouldn't have to read all of
> those again. Of course, no need for docs to be perfect, just enough to
> understand.

I added some comments mainly in "Array Functions and Operators" section.
The most debatable part is whether arrays should be flattened or not.
I wrote the patch to flatten always -- for example,
  trim_array( [[1,2],[3,4]], 1) => [1,2,3]
but we could design it to keep the dimension:
  trim_array( [[1,2],[3,4]], 1) => [[1,2]]

However, it's not so easy for comparison function (sort, set, ...) to
decide what we should do. Note that the existing <@ operator also ignored
dimensions in arrays:

=# SELECT ARRAY[[1,3],[2,5]] <@ ARRAY[[1,2,3],[4,5,6]];
 ?column?
----------
 t

--
Itagaki Takahiro

Attachment

pgsql-hackers by date:

Previous
From: Marti Raudsepp
Date:
Subject: Re: Streaming base backups
Next
From: Robert Haas
Date:
Subject: Re: sepgsql contrib module