Re: [HACKERS] [PATCH] Incremental sort - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [HACKERS] [PATCH] Incremental sort
Date
Msg-id CAPpHfdvH-LQPVQXLy-fbB5Ko9Svtmwsq9rJyPuO6JjQ2-G1Rxw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PATCH] Incremental sort  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] [PATCH] Incremental sort  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
Hi!

On Mon, Nov 20, 2017 at 12:24 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
On Wed, Nov 15, 2017 at 7:42 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Sure, please find rebased patch attached.

+ /*
+  * Check if first "skipCols" sort values are equal.
+  */
+ static bool
+ cmpSortSkipCols(IncrementalSortState *node, TupleTableSlot *a,
+                                                             TupleTableSlot *b)
+ {
+     int n, i;
+
+     Assert(IsA(node->ss.ps.plan, IncrementalSort));
+
+     n = ((IncrementalSort *) node->ss.ps.plan)->skipCols;
+
+     for (i = 0; i < n; i++)
+     {
+         Datum datumA, datumB, result;
+         bool isnullA, isnullB;
+         AttrNumber attno = node->skipKeys[i].attno;
+         SkipKeyData *key;
+
+         datumA = slot_getattr(a, attno, &isnullA);
+         datumB = slot_getattr(b, attno, &isnullB);
+
+         /* Special case for NULL-vs-NULL, else use standard comparison */
+         if (isnullA || isnullB)
+         {
+             if (isnullA == isnullB)
+                 continue;
+             else
+                 return false;
+         }
+
+         key = &node->skipKeys[i];
+
+         key->fcinfo.arg[0] = datumA;
+         key->fcinfo.arg[1] = datumB;
+
+         /* just for paranoia's sake, we reset isnull each time */
+         key->fcinfo.isnull = false;
+
+         result = FunctionCallInvoke(&key->fcinfo);
+
+         /* Check for null result, since caller is clearly not expecting one */
+         if (key->fcinfo.isnull)
+             elog(ERROR, "function %u returned NULL", key->flinfo.fn_oid);
+
+         if (!DatumGetBool(result))
+             return false;
+     }
+     return true;
+ }

Is there some reason not to use ApplySortComparator for this?  I think
you're missing out on lower-overhead comparators, and in any case it's
probably good code reuse, no?

However, for incremental sort case we don't need to know here whether A > B or B > A.  It's enough for us to know if A = B or A != B.  In some cases it's way cheaper.  For instance, for texts equality check is basically memcmp while comparison may use collation.

Embarrassingly, I was unaware of this patch and started prototyping
exactly the same thing independently[1].  I hadn't got very far and
will now abandon that, but that's one thing I did differently.  Two
other things that may be different: I had a special case for groups of
size 1 that skipped the sorting, and I only sorted on the suffix
because I didn't put tuples with different prefixes into the sorter (I
was assuming that tuplesort_reset was going to be super efficient,
though I hadn't got around to writing that).  I gather that you have
determined empirically that it's better to be able to sort groups of
at least MIN_GROUP_SIZE than to be able to skip the comparisons on the
leading attributes, but why is that the case?

Right.  The issue that not only case of one tuple per group could cause overhead, but few tuples (like 2 or 3) is also case of overhead.  Also, overhead is related not only to sorting.  While investigate of regression case provided by Heikki [1], I've seen extra time spent mostly in extra copying of sample tuple and comparison with that.  In order to cope this overhead I've introduced MIN_GROUP_SIZE which allows to skip copying sample tuples too frequently.


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 

pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: [HACKERS] [PATCH] Incremental sort
Next
From: Masahiko Sawada
Date:
Subject: Re: [HACKERS][PATCH]pg_buffercache add a buffer state column, Addfuction to decode buffer state