Re: UNNEST ... WITH ORDINALITY (AND POSSIBLY OTHER STUFF) - Mailing list pgsql-hackers

From David Fetter
Subject Re: UNNEST ... WITH ORDINALITY (AND POSSIBLY OTHER STUFF)
Date
Msg-id 20101119190602.GA5168@fetter.org
Whole thread Raw
In response to Re: UNNEST ... WITH ORDINALITY (AND POSSIBLY OTHER STUFF)  (<Caleb.Welton@emc.com>)
Responses Re: UNNEST ... WITH ORDINALITY (AND POSSIBLY OTHER STUFF)
List pgsql-hackers
On Fri, Nov 19, 2010 at 01:48:06PM -0500, Caleb.Welton@emc.com wrote:
> Note the standard also supports unnesting multiple arrays
> concurrently, the rule for handling arrays with different lengths is
> to use null padding of the shorter array.

Interesting.  I notice that our version doesn't support multiple-array
UNNEST just yet.

SELECT * FROM UNNEST(ARRAY[1,2,3,4], ARRAY['hello','world']);
ERROR:  function unnest(integer[], text[]) does not exist
LINE 1: SELECT * FROM UNNEST(ARRAY[1,2,3,4], ARRAY['hello','world'])...                     ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

> 
> SELECT * FROM
>    UNNEST( ARRAY[5,2,3,4],
>            ARRAY['hello', 'world'] )
>    WITH ORDINALITY AS t(a,b,i);
> 
> a     b     i
> ---   ---------- ------
> 5  'hello'  1
> 2  'world'  2
> 3           3
> 4           4
> (4 rows)

This looks a lot like an OUTER JOIN on the ORDINALITY column of each
of the individual UNNEST...WITH ORDINALITYs.  Given that we know the
ORDINALITY in advance just by building the arrays, we could optimize
this away from FULL JOIN to LEFT (or RIGHT) JOINs.

> To implement this it is not just substituting the existing unnest(anyarray) function in multiple times.

Right.

> 
> Regards,
>    Caleb
> 
> On Nov 19, 2010, at 4:50 AM, <pgsql-hackers-owner@postgresql.org<mailto:pgsql-hackers-owner@postgresql.org>>
<pgsql-hackers-owner@postgresql.org<mailto:pgsql-hackers-owner@postgresql.org>>wrote:
 
> 
> From: David Fetter <david@fetter.org<mailto:david@fetter.org>>
> Date: November 18, 2010 11:48:16 PM PST
> To: Itagaki Takahiro <itagaki.takahiro@gmail.com<mailto:itagaki.takahiro@gmail.com>>
> Cc: PG Hackers <pgsql-hackers@postgresql.org<mailto:pgsql-hackers@postgresql.org>>
> Subject: Re: UNNEST ... WITH ORDINALITY (AND POSSIBLY OTHER STUFF)
> 
> 
> On Fri, Nov 19, 2010 at 11:40:05AM +0900, Itagaki Takahiro wrote:
> On Fri, Nov 19, 2010 at 08:33, David Fetter <david@fetter.org<mailto:david@fetter.org>> wrote:
> In order to get WITH ORDINALITY, would it be better to change
> gram.y to account for both WITH ORDINALITY and without, or just
> for the WITH ORDINALITY case?
> 
> We probably need to change gram.y and make UNNEST to be
> COL_NAME_KEYWORD.  UNNEST (without ORDINALITY) will call the
> existing unnest() function, and UNNEST() WITH ORDINALITY will call
> unnest_with_ordinality().
> 
> Thanks for sketching that out :)
> 
> BTW, what will we return for arrays with 2 or more dimensions?
> 
> At the moment, per the SQL standard, UNNEST without the WITH
> ORDINALITY clause flattens all dimensions.
> 
> SELECT * FROM UNNEST(ARRAY[[1,2],[3,4]]);
> unnest
> --------
>      1
>      2
>      3
>      4
> (4 rows)
> 
> Unless we want to do something super wacky and contrary to the SQL
> standard, UNNEST(array) WITH ORDINALITY should do the same.
> 
> There are no confusion in your two arguments version:
> UNNEST(anyarray, number_of_dimensions_to_unnest)
> but we will also support one argument version. Array indexes will
> be composite numbers in the cases. The possible design would be just
> return sequential serial numbers of the values -- the following two
> queries return the same results:
> 
> - SELECT i, v FROM UNNEST($1) WITH ORDINALITY AS t(v, i)
> - SELECT row_number() OVER () AS i, v FROM UNNEST($1) AS t(v)
> 
> Yes, that's what the standard says.  Possible less-than-total
> unrolling schemes include:
> 
> - Flatten specified number of initial dimensions into one list, e.g.
>  turn UNNEST(array_3d, 2) into SETOF(array_1d) with one column of
>  ORDINALITY
> 
> - Flatten similarly, but have an ORDINALITY column for each flattened
>  dimension.
> 
> - More exotic schemes, such as UNNEST(array_3d, [1,3]), with either of
>  the two methods above.
> 
> And of course the all-important:
> 
> - Other possibilities I haven't thought of :)
> 
> Cheers,
> David.
> --
> David Fetter <david@fetter.org<mailto:david@fetter.org>> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com<mailto:david.fetter@gmail.com>
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
> 
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
> 

-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Latches with weak memory ordering (Re: max_wal_senders must die)
Next
From: Pavel Stehule
Date:
Subject: how correctly detoast a Datum value?