Re: split_part for the last element - Mailing list pgsql-general

From Nikhil Benesch
Subject Re: split_part for the last element
Date
Msg-id CAPWqQZRCW68sz15HvyRfYMc9AoHrH87B_vqu7xDHMt0x3Fig_Q@mail.gmail.com
Whole thread Raw
In response to Re: split_part for the last element  (PALAYRET Jacques <jacques.palayret@meteo.fr>)
List pgsql-general
Right, that's option 2 in my original mail. There are several
deficiencies with that idiom:

  * It is non-obvious. Sure, it might make sense to you and I, but to
someone just learning SQL, it takes a minute to reason through why it
works. They're also unlikely to invent the trick on their own.
  * It is inefficient. When the strings are large reversing the
strings is a silly waste of compute.

On Fri, Oct 23, 2020 at 12:03 PM PALAYRET Jacques
<jacques.palayret@meteo.fr> wrote:
>
> Hello,
>
>   reverse(split_part(reverse('foo bar baz'), ' ', 1))   -> 'baz'
>
> Regards
>
> ----- Mail original -----
> De: "Nikhil Benesch" <nikhil.benesch@gmail.com>
> À: pgsql-general@lists.postgresql.org
> Envoyé: Vendredi 23 Octobre 2020 17:47:16
> Objet: split_part for the last element
>
> Hi,
>
> Suppose I need to split a string on a delimiter and select one of the
> resulting components. If I want a specific component counting from the
> start, that's easy:
>
>     split_part('foo bar baz', ' ', 1) -> 'foo'
>
> But if I want the last component, I have several less-than-ideal options:
>
> 1. (string_to_array('foo bar baz', '
> '))[cardinality(string_to_array('foo bar baz', ' ')) - 1]
> 2. reverse(split_part(reverse('foo bar baz'), ' ', 1))
> 3. (regexp_match('foo baz bar', '\S*$'))[1]
>
> Option 1 is probably the most understandable, especially if you are
> willing to introduce a temporary parts array:
>
>     select parts[cardinality(parts) - 1] from string_to_array('foo bar
> baz', ' ') parts
>
> But if the strings are long, this needlessly builds an array just to
> throw it away. Option 2 has similar efficiency problems and is just
> kind of silly. Option 3 is probably the best, but it's still a good
> bit more complicated than a simple split_part invocation.
>
> Is there another option I'm missing? Would there be interest in
> extending split part so that negative indices counted from the end, as
> in:
>
>     split_part('foo bar baz', ' ', -1) -> 'baz'
>
> Or adding a split_part_end function in which positive indices counted
> from the end:
>
>     split_part_end('foo bar baz', ' ', 1) -> 'baz'
>
> I'd be happy to prepare a patch if so.
>
> Cheers,
> Nikhil
>
>



pgsql-general by date:

Previous
From: PALAYRET Jacques
Date:
Subject: Re: split_part for the last element
Next
From: "David G. Johnston"
Date:
Subject: Re: split_part for the last element