Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options
Date
Msg-id 20250127.205115.1603241525112208994.ishii@postgresql.org
Whole thread Raw
In response to Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options  (Oliver Ford <ojford@gmail.com>)
List pgsql-hackers
> The attached patch should fix both of these. I've added extra tests
> with a PARTITION BY in the window clause to test for multiple
> partitions.

I have looked through the v5 patch. Here are review comments.

From 5268754b33103fefc511b57ec546103899f70dbe Mon Sep 17 00:00:00 2001
From: Oliver Ford <ojford@gmail.com>
Date: Thu, 23 Jan 2025 20:11:17 +0000
Subject: [PATCH] :ignore nulls

---
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 9a1acce2b5..d93a44633e 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -69,6 +69,10 @@ typedef struct WindowObjectData
     int            readptr;        /* tuplestore read pointer for this fn */
     int64        markpos;        /* row that markptr is positioned on */
     int64        seekpos;        /* row that readptr is positioned on */
+    int            ignore_nulls;    /* ignore nulls */
+    int64       *win_nonnulls;    /* tracks non-nulls in ignore nulls mode */

After ignore_nulls, there will be a 4-byte hole because win_nonnulls
is an 8-byte variable. It would be better to swap them.
 
@@ -1263,6 +1268,15 @@ begin_partition(WindowAggState *winstate)
 
             winobj->markpos = -1;
             winobj->seekpos = -1;
+
+
+            /* reallocate null check */
+            if (perfuncstate->winobj->ignore_nulls == IGNORE_NULLS)
+            {
+                perfuncstate->winobj->win_nonnulls = palloc_array(int64, 16);
+                perfuncstate->winobj->nonnulls_size = 16;

Those 2 lines above are not necessary. Since win_nonnulls are
allocated in ExecInitWindowAgg() in the per query query context, it
survives across partitions. You only need initialize nonnulls_len to
0.

+                perfuncstate->winobj->nonnulls_len = 0;
+            }
         }
     }
 
@@ -1383,7 +1397,9 @@ release_partition(WindowAggState *winstate)
 
         /* Release any partition-local state of this window function */
         if (perfuncstate->winobj)
+        {
             perfuncstate->winobj->localmem = NULL;
+        }

You accidentally added unnecessary curly braces.

@@ -2679,6 +2698,13 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
             winobj->argstates = wfuncstate->args;
             winobj->localmem = NULL;
             perfuncstate->winobj = winobj;
+            winobj->ignore_nulls = wfunc->ignore_nulls;
+            if (winobj->ignore_nulls == PARSER_IGNORE_NULLS)
+            {
+                winobj->win_nonnulls = palloc_array(int64, 16);
+                winobj->nonnulls_size = 16;
+                winobj->nonnulls_len = 0;
+            }

I don't like to see two "16" here. It would better to use #define or
something.

It will be better to declare the prototype of increment_nonnulls,
ignorenulls_getfuncarginpartition, and ignorenulls_getfuncarginframe
in the begging of the file as other static functions already do.

+/*
+ * ignorenulls_getfuncarginframe
+ * For IGNORE NULLS, get the next nonnull value in the frame, moving forward or backward
+ * until we find a value or reach the frame's end.
+ */
+static Datum
+ignorenulls_getfuncarginframe(WindowObject winobj, int argno,

Do you assume that win_nonnulls is sorted by pos? I think it's
necessarily true that pos in win_nonnulls array is sorted. Is that ok?

+    /*
+     * Store previous rows. Only possible in SEEK_HEAD mode
+     */
+    for (i = 0; i < winobj->nonnulls_len; ++i)
+    {
+        int            inframe;
+
+        if (winobj->win_nonnulls[i] < winobj->markpos)

There are too many "winobj->win_nonnulls[i]". You could assign to a
variable "winobj->win_nonnulls[i]" and use the variable.

+            continue;
+        if (!window_gettupleslot(winobj, winobj->win_nonnulls[i], slot))
+            continue;
+
+        inframe = row_is_in_frame(winstate, winobj->win_nonnulls[i], slot);
+        if (inframe <= 0)
+        {
+            if (inframe == -1 && set_mark)
+                WinSetMarkPosition(winobj, winobj->win_nonnulls[i]);

I think in most cases inframe returns 0 and WinSetMarkPosition is not
called. What use case do you have in your mind when inframe is -1?

+check_frame:
+    do
+    {
+        int            inframe;
+
+        if (!window_gettupleslot(winobj, abs_pos, slot))
+            goto out_of_frame;
+
+        inframe = row_is_in_frame(winstate, abs_pos, slot);
+        if (inframe == -1)
+            goto out_of_frame;
+        else if (inframe == 0)
+            goto advance;
+
+        gottuple = window_gettupleslot(winobj, abs_pos, slot);

Do you really need to call window_gettupleslot here? It's already
called above.

--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -576,6 +576,18 @@ typedef struct GroupingFunc
  * Collation information is irrelevant for the query jumbling, as is the
  * internal state information of the node like "winstar" and "winagg".
  */
+
+/*
+ * Null Treatment options. If specified, initially set to PARSER_IGNORE
+ * or PARSER_RESPECT. PARSER_IGNORE_NULLS is then converted to IGNORE_NULLS
+ * if the window function allows the null treatment clause.
+ */
+#define IGNORE_NULLS 4
+#define RESPECT_NULLS 3
+#define PARSER_IGNORE_NULLS 2
+#define PARSER_RESPECT_NULLS 1
+#define NO_NULLTREATMENT 0

This looks strange to me. Why do you start the define value from 4
down to 0? Also there is no place to use RESPECT_NULLS. Do we need it?

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: meson "experimental"?
Next
From: Dean Rasheed
Date:
Subject: Re: Virtual generated columns