> One thing I worry about the patch is, now the non-nulls array
> optimization was removed. Since then I have been thinking about if
> there could be other way to optimize searching for non null rows.
Here is the v12 patch to implement the optimization on top of Oliver's
v11 patch. Only src/backend/executor/nodeWindowAgg.c was modified
(especially ignorenulls_getfuncarginframe). In the patch I created
2-bit not null information array, representing following status for
each row:
UNKNOWN: the row is not determined whether it's NULL or NOT yet.
This is the initial value.
NULL: the row has been determined to be NULL.
NOT NULL: the row has been determined to be NOT NULL.
In ignorenulls_getfuncarginframe:
For the first time window function visits a row in a frame, the row is
fetched using window_gettupleslot() and it is checked whether it is in
the frame using row_is_in_frame(). If it's in the frame and the
information in the array is UNKNOWN, ExecEvalExpr() is executed to
find out if the expression on the function argument is NULL or
not. And the result (NULL or NOT NULL) is stored in the array.
If the information in the array is not UNKNOWN, we can skip calling
ExecEvalExpr() because the information is already in the array.
Note that I do not skip calling window_gettupleslot() and
row_is_in_frame(), skip only calling ExecEvalExpr(), because whether a
row is in a frame or not could be changing as the current row position
moves while processing window functions.
With this technique I observed around 40% speed up in my environment
using the script attached, comparing with Oliver's v11 patch.
v11:
rows duration (msec)
1000 41.019
2000 148.957
3000 248.291
4000 442.478
5000 687.395
v12:
rows duration (msec)
1000 27.515
2000 78.913
3000 174.737
4000 311.412
5000 482.156
The patch is now generated using the standard git format-patch. Also
I have slightly adjusted the coding style so that it aligns with the
one used in nodeWindowAgg.c, and ran pgindent.
Note that I have not modified ignorenulls_getfuncarginpartition yet. I
think we could optimize it using the not null info infrastructure as
well. Will come up with it.
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
for i in 1000 2000 3000 4000 5000
do
echo "$i rows: "
pos=`expr $i / 2`
psql -a test <<EOF
\timing
explain analyze
SELECT
x,
nth_value(x,$pos) IGNORE NULLS OVER w
FROM generate_series(1,$i) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING);
EOF
done | egrep "rows:|Time:" | egrep -v "Planning|Execution"|
sed -e 's/rows: *//' -e 's/Time: //' -e 's/ms//'