Re: POC: converting Lists into arrays - Mailing list pgsql-hackers

From David Rowley
Subject Re: POC: converting Lists into arrays
Date
Msg-id CAKJS1f8=5GBvZp-WoWOU8aMvoELZgWMMe=p54oEAXpi_2-zCNg@mail.gmail.com
Whole thread Raw
In response to Re: POC: converting Lists into arrays  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: POC: converting Lists into arrays
List pgsql-hackers
On Sat, 25 May 2019 at 12:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Now, it turns out that the new formulation of foreach() is really
> strictly equivalent to
>
>         for (int pos = 0; pos < list_length(list); pos++)
>         {
>                 whatever-type item = list_nth(list, pos);
>                 ...
>         }
>
> which means that it could cope fine with deletion of the current
> list element if we were to provide some supported way of not
> incrementing the list index counter.  That is, instead of
> code that looks more or less like this:
>
>         for (int pos = 0; pos < list_length(list); pos++)
>         {
>                 whatever-type item = list_nth(list, pos);
>                 ...
>                 if (delete_cur)
>                 {
>                         list = list_delete_nth_cell(list, pos);
>                         pos--;   /* keep loop in sync with deletion */
>                 }
>         }
>
> we could write, say:
>
>         foreach(lc, list)
>         {
>                 whatever-type item = lfirst(lc);
>                 ...
>                 if (delete_cur)
>                 {
>                         list = list_delete_cell(list, lc);
>                         foreach_backup(lc); /* keep loop in sync with deletion */
>                 }
>         }
>
> which is the same thing under the hood.  I'm not quite sure if that way
> is better or not.  It's more magical than explicitly manipulating a list
> index, but it's also shorter and therefore less subject to typos.

If we're doing an API break for this, wouldn't it be better to come up
with something that didn't have to shuffle list elements around every
time one is deleted?

For example, we could have a foreach_delete() that instead of taking a
pointer to a ListCell, it took a ListDeleteIterator which contained a
ListCell pointer and a Bitmapset, then just have a macro that marks a
list item as deleted (list_delete_current(di)) and have a final
cleanup at the end of the loop.

The cleanup operation can still use memmove, but just only move up
until the next bms_next_member on the deleted set, something like
(handwritten and untested):

void
list_finalize_delete(List *list, ListDeleteIterator *di)
{
    int srcpos, curr, tarpos;

    /* Zero the source and target list position markers */
    srcpos = tarpos = 0;
    curr = -1;
    while ((curr = bms_next_member(di->deleted, curr) >= 0)
    {
        int n = curr - srcpos;
        if (n > 0)
        {
            memmove(&list->elements[tarpos], &list->elements[srcpos],
                                n * sizeof(ListCell));
            tarpos += n;
        }
        srcpos = curr + 1;
    }
    list->length = tarpos;
}

Or maybe we should worry about having the list in an inconsistent
state during the loop?  e.g if the list is getting passed into a
function call to do something.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Fix link for v12
Next
From: David Rowley
Date:
Subject: Re: Performance issue in foreign-key-aware join estimation